Mehdi AMINI via llvm-dev
2018-May-03 21:44 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Tue, May 1, 2018 at 4:50 PM, Teresa Johnson <tejohnson at google.com> wrote:

> Hi Mehdi, thanks for the comments, responses and a tweaked proposal below.
> Teresa
>
> On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>> Hi,
>>
>> My main concern is this one:
>>
>> > Currently, I am emitting the summary entries at the end, after the metadata nodes. Note that
>> > the ModuleSummaryIndex is not currently referenced from the Module, and isn't currently
>> > created when parsing the Module IR bitcode (there is a separate derived class for reading the
>> > ModuleSummaryIndex from bitcode). This is because they are not currently used at the same
>> > time. However, in the future there is no reason why we couldn't tag the global values in the
>> > Module's LLVM assembly with the corresponding summary entry if the ModuleSummaryIndex is
>> > available when printing the Module in the assembly writer. I.e. we could do the following for
>> > "main" from the above example when printing the IR definition (note the "^3" at the end):
>>
>> I believe the reason that the ModuleSummaryIndex is not attached to the Module is that it is
>> fundamentally not a piece of IR; it is conceptually really an analysis result. Usually other
>> analyses don't serialize their result; we happen to serialize this one for an optimization
>> purpose (reloading it and making the thin link faster).
>
> True. My understanding is that the push for having it serialized via assembly is due to the fact
> that it is emitted into the bitcode. I know there is disagreement on this reasoning; I am hoping
> to have a proposal that is acceptable to everyone. =)
>
>> The fundamental problem is that an analysis result has to be able to be invalidated with IR
>> changes; attaching this directly to the module wouldn't achieve this. The risk is that when the
>> IR and the summary get out of sync (`clang -O2 my_module_with_summaries.ll -emit-llvm -o
>> my_optimized_module_with_summaries.ll`) the summaries would be badly wrong.
>>
>> Have you looked into what it'd take to make it a "real" analysis in the pass manager?
>
> Thanks for raising this issue specifically; I hadn't addressed it in my proposal and it is a big
> one. I am not proposing that we attempt to maintain the summary through optimization passes, and
> definitely don't think we should do that. IMO deserializing it should be for testing the thin
> link and the combined summaries in the backends only. To that end, I have an idea (some
> background first, below).
>
> Note that in some cases the module summary analysis is an analysis pass, i.e. when invoked by
> "opt -module-summary". However, some time ago, when Peter added support for splitting the
> bitcode (for CFI purposes) and therefore needed to generate a summary in each partition (Thin
> and Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module summary builder
> directly (twice). That writer is what gets invoked now when building via "clang -flto=thin" and
> with "opt -thinlto-bc", so there it is not invoked/maintained as an analysis pass/result. It
> would be tricky to figure out how to even split, rather than recompute, the module summary index
> in that case. Even in the case where we are still invoking it as an analysis pass
> (opt -module-summary), we would need to figure out how to read in the module summary to use as
> the analysis result when available (so that it could be invalidated and recomputed when stale).
>
> Rather than add this plumbing, and just have it discarded if opt does any optimization, I think
> we should focus, at least for the time being, on supporting reading the summary from assembly
> exactly where we currently read in the summary from bitcode:
> 1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which currently have to be
>    preceded by "opt -module-summary/-thinlto-bc" to generate an index, but could just build it
>    from assembly instead).
> 2) For the LTO backends (e.g. tools such as llvm-lto which can consume a combined index and
>    invoke the backends, or "clang -fthinlto-index=" for distributed ThinLTO backend testing),
>    where we could build the combined summary index from assembly instead.
>
> This greatly simplifies the reading side, as there are no optimizations performed on the IR
> after the index is read in these cases that would require invalidation. It also simplifies
> adding the parsing support, since it gets invoked exactly where we expect to build an index
> currently (i.e. since we don't currently build or store the ModuleSummaryIndex when parsing the
> Module from bitcode). It doesn't preclude someone from figuring out how to compute the module
> summary analysis result from the assembly, and invalidating it after optimization, when reading
> the Module IR via 'opt' in the future.
>
> Does this seem like a reasonable proposal to everyone?

Sounds good to me.

That would make .ll files quite convenient during debugging, I think? We could disassemble,
manually change summaries, and re-assemble a bitcode file before running the (thin-)link again.

Thanks,

--
Mehdi
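Concretely, the round trip being described might look something like the following sketch,
assuming the proposed summary support in llvm-dis/llvm-as lands (the file names here are
hypothetical):

    # build a per-module summary and disassemble it, summary entries included
    opt -module-summary test.ll -o test.bc
    llvm-dis test.bc -o test_with_summary.ll
    # hand-edit the ^N summary entries in test_with_summary.ll, then re-assemble
    llvm-as test_with_summary.ll -o test.bc
    # re-run the thin link on the modified bitcode
    llvm-lto -thinlto-action=thinlink -o combined.thinlto.bc test.bc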
> Thanks,
> Teresa
>
>> Cheers,
>>
>> --
>> Mehdi
>>
>> On Tue, May 1, 2018 at 11:10 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>>
>>> On Tue, May 1, 2018 at 9:00 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>>
>>>> On Mon, Apr 30, 2018 at 10:21 AM Peter Collingbourne <peter at pcc.me.uk> wrote:
>>>>
>>>>> On Mon, Apr 30, 2018 at 8:32 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>>>>
>>>>>> Hi Peter,
>>>>>> Thanks for your comments, replies below.
>>>>>> Teresa
>>>>>>
>>>>>> On Wed, Apr 25, 2018 at 2:08 PM Peter Collingbourne <peter at pcc.me.uk> wrote:
>>>>>>
>>>>>>> Hi Teresa,
>>>>>>>
>>>>>>> Thanks for sending this proposal out.
>>>>>>>
>>>>>>> I would again like to register my disagreement with the whole idea of writing summaries
>>>>>>> in LLVM assembly format. In my view it is clear that this is not the right direction, as
>>>>>>> it only invites additional complexity and more ways for things to go wrong for no real
>>>>>>> benefit. However, I don't have the energy to argue that point any further, so I won't
>>>>>>> stand in the way here.
>>>>>>
>>>>>> I assume you are most concerned with the re-assembly/deserialization of the summary. My
>>>>>> main goal is to get this dumped into a text format, and I went this route since the last
>>>>>> dumper RFC was blocked, with the LLVM assembly direction pushed instead.
>>>>>
>>>>> Yes, my main concern is with the deserialization. My view is that it should only be
>>>>> allowed for combined summaries -- allowing it for per-module summaries is unnecessary, as
>>>>> it creates the possibility of things getting out of sync. Given that, we don't actually
>>>>> need an assembly representation and we can use whichever format is most convenient. But
>>>>> given the opposition to this viewpoint, I am willing to concede getting the format (IMHO)
>>>>> right in favour of something that is acceptable to others, just so that we can test
>>>>> things in a more reasonable way.
>>>>>
>>>>>>> On Tue, Apr 24, 2018 at 7:43 AM, Teresa Johnson <tejohnson at google.com> wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> I started working on a long-standing request to have the summary dumped in a readable
>>>>>>>> format to text, and specifically to emit it to LLVM assembly. Proposal below, please
>>>>>>>> let me know your thoughts.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Teresa
>>>>>>>>
>>>>>>>> RFC: LLVM Assembly format for ThinLTO Summary
>>>>>>>> ========================================
>>>>>>>>
>>>>>>>> Background
>>>>>>>> -----------------
>>>>>>>> ThinLTO operates on small summaries computed during the compile step (i.e. with
>>>>>>>> "-c -flto=thin"), which are then analyzed and updated during the Thin Link stage, and
>>>>>>>> utilized to perform IR updates during the post-link ThinLTO backends. The summaries
>>>>>>>> are emitted as LLVM bitcode, but currently have no LLVM assembly representation.
>>>>>>>>
>>>>>>>> There are two ways to generate a bitcode file containing summary records for a module:
>>>>>>>> 1. Compile with "clang -c -flto=thin"
>>>>>>>> 2. Build from LLVM assembly using "opt -module-summary"
>>>>>>>> Either of these will cause the ModuleSummaryIndex analysis pass (which builds the
>>>>>>>> summary index in memory for a module) to be added to the pipeline just before bitcode
>>>>>>>> emission. Additionally, a combined index is created by merging all the per-module
>>>>>>>> indexes during the Thin Link, which is optionally emitted as a bitcode file.
>>>>>>>>
>>>>>>>> Currently, the only way to view these records is via "llvm-bcanalyzer -dump", then
>>>>>>>> manually decoding the raw bitcode dumps. Relatedly, there is YAML reader/writer support
>>>>>>>> for the CFI-related summary fields (-wholeprogramdevirt-read-summary and
>>>>>>>> -wholeprogramdevirt-write-summary). Last summer, GSoC student Charles Saternos
>>>>>>>> implemented support to dump the summary in YAML from llvm-lto2 (D34080), including the
>>>>>>>> rest of the summary fields (D34063); however, there was pushback on the related RFC for
>>>>>>>> dumping via YAML or another format rather than emitting as LLVM assembly.
>>>>>>>>
>>>>>>>> Goals:
>>>>>>>> 1. Define an LLVM assembly format for the summary index.
>>>>>>>> 2. Define the interaction between parsing of the summary from LLVM assembly and
>>>>>>>>    synthesis of a new summary index from the IR.
>>>>>>>> 3. Implement printing and parsing of the summary index LLVM assembly.
>>>>>>>>
>>>>>>>> Proposed LLVM Assembly Format
>>>>>>>> ----------------------------------------------
>>>>>>>> There are several top-level data structures within the ModuleSummaryIndex:
>>>>>>>> 1. ModulePathStringTable: Holds the paths to the modules summarized in the index (only
>>>>>>>>    one entry for per-module indexes and multiple in the combined index), along with
>>>>>>>>    their hashes (for incremental builds and global promotion).
>>>>>>>> 2. GlobalValueMap: A map from global value GUIDs to the corresponding
>>>>>>>>    function/variable/alias summary (or summaries, for the combined index and weak
>>>>>>>>    linkage).
>>>>>>>> 3. CFI-related data structures (TypeIdMap, CfiFunctionDefs, and CfiFunctionDecls).
>>>>>>>>
>>>>>>>> I have a WIP patch to AsmWriter.cpp to print the ModuleSummaryIndex that I was using to
>>>>>>>> play with the format. It currently prints 1 and 2 above. I've left the CFI-related
>>>>>>>> summary data structures as a TODO for now, until the format is at least conceptually
>>>>>>>> agreed, but from looking at those I don't see an issue with using the same format (with
>>>>>>>> a note/question for Peter on the CFI type test representation below).
>>>>>>>>
>>>>>>>> I modeled the proposed format on metadata, with a few key differences noted below. Like
>>>>>>>> metadata, I propose enumerating the entries with the SlotTracker, and prefixing them
>>>>>>>> with a special character. Avoiding characters already used in some fashion (i.e. "!"
>>>>>>>> for metadata and "#" for attributes), I initially have chosen "^". Open to suggestions
>>>>>>>> though.
>>>>>>>>
>>>>>>>> Consider the following example:
>>>>>>>>
>>>>>>>>   extern void foo();
>>>>>>>>   int X;
>>>>>>>>   int bar() { foo(); return X; }
>>>>>>>>   void barAlias() __attribute__ ((alias ("bar")));
>>>>>>>>   int main() { barAlias(); return bar(); }
>>>>>>>>
>>>>>>>> The proposed format has one entry per ModulePathStringTable entry and one per
>>>>>>>> GlobalValueMap/GUID, and looks like:
>>>>>>>>
>>>>>>>>   ^0 = module: {path: testA.o, hash: 5487197307045666224}
>>>>>>>>   ^1 = gv: {guid: 1881667236089500162, name: X, summaries: {variable: {module: ^0, flags: {linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 1}}}}
>>>>>>>>   ^2 = gv: {guid: 6699318081062747564, name: foo}
>>>>>>>>   ^3 = gv: {guid: 15822663052811949562, name: main, summaries: {function: {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 5, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^5, hotness: unknown}, {callee: ^4, hotness: unknown}}}}}
>>>>>>>>   ^4 = gv: {guid: 16434608426314478903, name: bar, summaries: {function: {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 3, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^2, hotness: unknown}}, refs: {^1}}}}
>>>>>>>>   ^5 = gv: {guid: 18040127437030252312, name: barAlias, summaries: {alias: {module: ^0, flags: {linkage: extern, notEligibleToImport: 0, live: 0, dsoLocal: 1}, aliasee: ^4}}}
>>>>>>>>
>>>>>>>> Like metadata, the fields are tagged (currently using lower camel case; maybe upper
>>>>>>>> camel case would be preferable).
>>>>>>>>
>>>>>>>> The proposed format has a structure that reflects the data structures in the summary
>>>>>>>> index. For example, consider the entry "^4". This corresponds to the function "bar".
>>>>>>>> The entry for that GUID in the GlobalValueMap contains a list of summaries. For
>>>>>>>> per-module summaries such as this, there will be at most one summary (with no summary
>>>>>>>> list for an external function like "foo"). In the combined summary there may be
>>>>>>>> multiple, e.g. in the case of linkonce_odr functions which have definitions in multiple
>>>>>>>> modules. The summary list for bar ("^4") contains a FunctionSummary, so the summary is
>>>>>>>> tagged "function:". The FunctionSummary contains both a flags structure (inherited from
>>>>>>>> the base GlobalValueSummary class) and a funcFlags structure (specific to
>>>>>>>> FunctionSummary). It therefore contains a brace-enclosed list of flag tags/values for
>>>>>>>> each.
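For the combined-index case mentioned above (multiple summaries per GUID), an entry might look
something like the sketch below. This is only an extrapolation of the proposed per-module syntax:
the paths, hashes, GUIDs and instruction counts are made up, and the spelling of the linkage
value and the layout of a multi-entry summary list are guesses rather than part of the proposal.

    ^0 = module: {path: testA.o, hash: 1111111111111111111}
    ^1 = module: {path: testB.o, hash: 2222222222222222222}
    ; a linkonce_odr function defined in both modules would carry two summaries:
    ^2 = gv: {guid: 3333333333333333333, name: qux, summaries: {function: {module: ^0, flags: {linkage: linkonce_odr, notEligibleToImport: 0, live: 0, dsoLocal: 0}, insts: 2, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}}, function: {module: ^1, flags: {linkage: linkonce_odr, notEligibleToImport: 0, live: 0, dsoLocal: 0}, insts: 2, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}}}}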
>>>>>>>> Where a global value summary references another global value summary (e.g. via a call
>>>>>>>> list, reference list, or aliasee), the entry is referenced by its slot. E.g. the alias
>>>>>>>> "barAlias" ("^5") references its aliasee "bar" as "^4". Note that, in comparison,
>>>>>>>> metadata assembly entries tend to be much more decomposed, since many metadata fields
>>>>>>>> are themselves metadata (so those entries tend to be shorter, with references to other
>>>>>>>> metadata nodes).
>>>>>>>>
>>>>>>>> Currently, I am emitting the summary entries at the end, after the metadata nodes. Note
>>>>>>>> that the ModuleSummaryIndex is not currently referenced from the Module, and isn't
>>>>>>>> currently created when parsing the Module IR bitcode (there is a separate derived class
>>>>>>>> for reading the ModuleSummaryIndex from bitcode). This is because they are not
>>>>>>>> currently used at the same time. However, in the future there is no reason why we
>>>>>>>> couldn't tag the global values in the Module's LLVM assembly with the corresponding
>>>>>>>> summary entry if the ModuleSummaryIndex is available when printing the Module in the
>>>>>>>> assembly writer. I.e. we could do the following for "main" from the above example when
>>>>>>>> printing the IR definition (note the "^3" at the end):
>>>>>>>>
>>>>>>>>   define dso_local i32 @main() #0 !dbg !17 ^3 {
>>>>>>>>
>>>>>>>> For CFI data structures, the format would be similar. It appears that TypeIds are
>>>>>>>> referred to by string name in the top-level TypeIdMap (a std::map indexed by
>>>>>>>> std::string type identifier), whereas they are referenced by GUID within the
>>>>>>>> FunctionSummary class (i.e. the TypeTests vector and the VFuncId structure). For the
>>>>>>>> LLVM assembly I think there should be a top-level entry for each TypeIdMap entry, which
>>>>>>>> lists both the type identifier string and its GUID (followed by its associated
>>>>>>>> information stored in the map), and the TypeTests/VFuncId references on the
>>>>>>>> FunctionSummary entries can reference it by summary slot number. I.e. something like:
>>>>>>>>
>>>>>>>>   ^1 = typeid: {guid: 12345, identifier: name_of_type, ...
>>>>>>>>   ^2 = gv: {... {function: {.... typeTests: {^1, ...
>>>>>>>>
>>>>>>>> Peter - is that correct and does that sound ok?
>>>>>>>
>>>>>>> I don't think that would work, because the purpose of the top-level TypeIdMap is to
>>>>>>> contain resolutions for each type identifier, and per-module summaries do not contain
>>>>>>> resolutions (only the combined summary does). What that means in practice is that we
>>>>>>> would not be able to recover and write out a type identifier name for per-module
>>>>>>> summaries as part of ^1 in your example (well, we could in principle, because the name
>>>>>>> is stored somewhere in the function's IR, but that could get complicated).
>>>>>>
>>>>>> Ah ok. I guess the top-level map then is generated by the regular LTO portion of the
>>>>>> link (since it presumably requires IR during the Thin Link to get into the combined
>>>>>> summary)?
>>>>> Yes, we fill in the map during the LowerTypeTests and WholeProgramDevirt passes in the
>>>>> regular LTO part of the link, e.g. here:
>>>>> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#823
>>>>>
>>>>>>> Probably the easiest thing to do is to keep the type identifiers as GUIDs in the
>>>>>>> function summaries and write out the mapping of type identifiers as a top-level entity.
>>>>>>
>>>>>> To confirm, you mean during the compile step create a top-level entity that maps
>>>>>> GUID -> identifier?
>>>>>
>>>>> I mean that you could represent this with something like:
>>>>>
>>>>>   ^typeids = {^1, ^2, ^3}
>>>>>   ^1 = typeid: {identifier: typeid1, ...}
>>>>>   ^2 = typeid: {identifier: typeid2, ...}
>>>>>   ^3 = typeid: {identifier: typeid3, ...}
>>>>>
>>>>> There's no need to store the GUIDs here because they can be computed from the type
>>>>> identifiers. The GUIDs would only be stored in the typeTests (etc.) fields in each
>>>>> function summary.
>>>>
>>>> I suppose we don't need to store the GUIDs at the top level in the in-memory summary. But
>>>> I think it would be good to emit the GUIDs in the typeid assembly entries, because that
>>>> makes the association in the assembly much more obvious. I.e. going back to my original
>>>> example:
>>>>
>>>>   ^1 = typeid: {guid: 12345, identifier: name_of_type, ...
>>>>   ^2 = gv: {... {function: {.... typeTests: {^1, ...
>>>>
>>>> If we didn't include the GUID in the typeid entry, but rather just the identifier, and put
>>>> the GUID in the typeTests list in the GV's entry, it wouldn't be obvious at all from the
>>>> assembly listing which typeid goes with which typeTest. It's also less compact to include
>>>> the GUID in each typeTests list.
>>>
>>> I get that, but my point was that in a per-module summary the TypeIdMap is empty, so there
>>> will be no names, only GUIDs.
>>>
>>> For "making the association more obvious" we might just want to have the assembly writer
>>> emit the GUID of a name as a comment.
>>>
>>>> Or perhaps we are saying the same thing - I can't tell from your above example if the GUID
>>>> is also emitted in the "typeid:" entries.
>>>
>>> No, it wouldn't be.
>>>
>>>> I'm not sure there is a need for the:
>>>>   ^typeids = {^1, ^2, ^3}
>>>> We can just build the typeids list on the fly as "= typeid:" entries are read in.
>>>
>>> That's true. Given that nothing actually needs to refer to them, we can just represent the
>>> typeids as something like
>>>
>>>   typeid: {identifier: typeid1, ...} ; guid = 123
>>>   typeid: {identifier: typeid2, ...} ; guid = 456
>>>   typeid: {identifier: typeid3, ...} ; guid = 789
>>>
>>> without an associated number.
>>>
>>> Peter
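Putting those two points together (GUIDs in the per-function typeTests lists, type identifier
names only in the combined summary, with the GUID of a name emitted as a comment), a
combined-summary listing would associate them roughly as in this sketch. The placeholder
identifier and GUID values are reused from the example above; the gv entry and the name "f" are
hypothetical, not output from the WIP patch:

    typeid: {identifier: typeid1, ...}    ; guid = 123
    ^2 = gv: {guid: ..., name: f, summaries: {function: {module: ^0, ..., typeTests: {123}}}}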
>>>>>>>> Issues when Parsing of Summaries from Assembly
>>>>>>>> --------------------------------------------------------------------
>>>>>>>> When reading an LLVM assembly file containing module summary entries, a
>>>>>>>> ModuleSummaryIndex will be created from the entries. Things to consider are the
>>>>>>>> behavior when:
>>>>>>>>
>>>>>>>> - Invoked with "opt -module-summary" (which currently builds a new summary index from
>>>>>>>>   the IR). Options:
>>>>>>>>   1. Recompute the summary and throw away the summary in the assembly file.
>>>>>>>>   2. Ignore -module-summary and build the summary from the LLVM assembly.
>>>>>>>>   3. Give an error.
>>>>>>>>   4. Compare the two summaries (one created from the assembly and the new one created
>>>>>>>>      by the analysis phase from the IR), and error if they are different.
>>>>>>>>   My opinion is to do 1), so that the behavior using -module-summary doesn't change. We
>>>>>>>>   also need a way to force building of a fresh module summary for cases where the user
>>>>>>>>   has modified the LLVM assembly of the IR (see below).
>>>>>>>>
>>>>>>>> - How to handle older LLVM assembly files that don't contain new summary fields.
>>>>>>>>   Options:
>>>>>>>>   1. Force the LLVM assembly file to be recreated with a new summary, i.e.
>>>>>>>>      "opt -module-summary -o - | llvm-dis".
>>>>>>>>   2. Auto-upgrade, by silently creating conservative values for the new summary
>>>>>>>>      entries.
>>>>>>>>   I lean towards 2) (when possible) for user-friendliness and to reduce required churn
>>>>>>>>   on test inputs.
>>>>>>>>
>>>>>>>> - How to handle partial or incorrect LLVM assembly summary entries. How to handle
>>>>>>>>   partial summaries depends in part on how we answer the prior question about
>>>>>>>>   auto-upgrading. I think the best option there is to handle it automatically when
>>>>>>>>   possible. However, I do think we should error on glaring problems like obviously
>>>>>>>>   missing information -- for example, when there is summary data in the LLVM assembly,
>>>>>>>>   but summary entries are missing for some global values, e.g. if the user modified
>>>>>>>>   the assembly to add a function but forgot to add a corresponding summary entry. We
>>>>>>>>   could still have subtle issues (e.g. the user adds a new call but forgets to update
>>>>>>>>   the caller's summary call list), but it will be harder to detect those.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
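To make the "obviously missing information" case concrete: a hand-edited file along the lines of
the following sketch (a hypothetical function, reusing the proposed syntax and values from the
example above) would be expected to draw an error, since @newfunc has no corresponding gv entry:

    define i32 @newfunc() {
      ret i32 0
    }
    ; summary entries exist for the other globals, but none was added for newfunc:
    ^0 = module: {path: testA.o, hash: 5487197307045666224}
    ^1 = gv: {guid: 1881667236089500162, name: X, summaries: {variable: {module: ^0, flags: {linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 1}}}}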
Teresa Johnson via llvm-dev
2018-May-03 22:07 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:

>> Does this seem like a reasonable proposal to everyone?
>
> Sounds good to me.
>
> That would make .ll files quite convenient during debugging I think? We could disassemble,
> manually change summaries, and re-assemble a bitcode file before running the (thin-)link again.

Great, yep, that's exactly the idea. Although the tools that invoke the thin link could just
consume the manually changed assembly summary files directly (or have a mechanism to re-assemble
to a summary-only bitcode file).

Teresa

> Thanks,
>
> --
> Mehdi

--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
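For concreteness, the two alternatives mentioned here might look like the following sketch
(hypothetical file names; option (a) assumes the thin-link tools learn to consume summary-bearing
.ll input directly, which is part of the proposal rather than existing behavior):

    # (a) proposed: the thin link consumes the edited assembly directly
    llvm-lto -thinlto-action=thinlink -o combined.thinlto.bc edited_a.ll edited_b.ll
    # (b) or re-assemble first and feed the resulting bitcode to the thin link
    llvm-as edited_a.ll -o a.bc
    llvm-as edited_b.ll -o b.bc
    llvm-lto -thinlto-action=thinlink -o combined.thinlto.bc a.bc b.bc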
Peter Collingbourne via llvm-dev
2018-May-03 22:07 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:

>> Does this seem like a reasonable proposal to everyone?
>
> Sounds good to me.
>
> That would make .ll files quite convenient during debugging I think? We could disassemble,
> manually change summaries, and re-assemble a bitcode file before running the (thin-)link again.

Yes, that seems more reasonable than what I thought you had in mind. If the only consumer of this
information is llvm-as, then the purpose of the asm summary format is just to provide a way to
create a .bc file for testing purposes, which is certainly a useful capability.

Peter
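As a sketch of that testing flow (hypothetical file names; assumes llvm-as learns to parse the
summary entries), an edited or hand-written summary could be assembled into bitcode and fed to
the places that already consume a summary today, e.g. a distributed ThinLTO backend:

    # assemble the edited combined summary into a bitcode index
    llvm-as combined_summary.ll -o combined.thinlto.bc
    # drive a ThinLTO backend test with it
    clang -O2 -c -x ir module.bc -fthinlto-index=combined.thinlto.bc -o module.o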
David Blaikie via llvm-dev
2018-May-03 22:29 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 3:08 PM Peter Collingbourne via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> wrote: > >> >> >> Le mar. 1 mai 2018 à 16:50, Teresa Johnson <tejohnson at google.com> a >> écrit : >> >>> Hi Mehdi, thanks for the comments, responses and a tweaked proposal >>> below. Teresa >>> >>> On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> >>> wrote: >>> >>>> Hi, >>>> >>>> My main concern is this one: >>>> >>>> > Currently, I am emitting the summary entries at the end, after the >>>> metadata nodes. Note that the ModuleSummaryIndex is not currently >>>> referenced from the Module, and isn’t currently created when parsing the >>>> Module IR bitcode (there is a separate derived class for reading the >>>> ModuleSummaryIndex from bitcode). This is because they are not currently >>>> used at the same time. However, in the future there is no reason why we >>>> couldn’t tag the global values in the Module’s LLVM assembly with the >>>> corresponding summary entry if the ModuleSummaryIndex is available when >>>> printing the Module in the assembly writer. I.e. we could do the following >>>> for “main” from the above example when printing the IR definition (note the >>>> “^3” at the end): >>>> >>>> >>>> I believe the reason that the ModuleSummaryIndex is not attached to the >>>> Module is that it is fundamentally not a piece of IR, but it is >>>> conceptually really an Analysis result. >>>> Usually other analyses don't serialize their result, we happen to >>>> serialize this one for an optimization purpose (reloading it and making the >>>> thin-link faster). >>>> >>> >>> True. My understanding is that the push for having it serialized via >>> assembly is due to the fact that it is emitted into the bitcode. I know >>> there is disagreement on this reasoning, I am hoping to have a proposal >>> that is acceptable to everyone. =) >>> >>> >>> >>>> The fundamental problem is that an analysis result has to be able to be >>>> invalidated with IR changes, attaching this directly to the module wouldn't >>>> achieve this. The risk is that when the IR and the summary get out-of-sync >>>> (`clang -O2 my_module_with_summaries.ll -emit-llvm -o my_optimized >>>> module_with_summaries.ll`) the summaries would be badly wrong. >>>> >>>> Have you looked into what it'd take to make it a "real" analysis in the >>>> pass manager? >>>> >>> >>> Thanks for raising this issue specifically, I hadn't addressed it in my >>> proposal and it is a big one. I am not proposing that we attempt to >>> maintain the summary through optimization passes, and definitely don't >>> think we should do that. IMO deserializing it should be for testing the >>> thin link and the combined summaries in the backends only. To that end, I >>> have an idea (below some background first). >>> >>> Note that in some cases the module summary analysis is an analysis pass. >>> I.e. when invoked by "opt -module-summary=". However, some time ago when >>> Peter added the support for splitting the bitcode (for CFI purposes) and >>> therefore needed to generate a summary in each partition (Thin and >>> Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module >>> summary builder directly (twice). This writer is what gets invoked now when >>> building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is >>> not invoked/maintained as an analysis pass/result. 
It would be tricky to >>> figure out how to even split rather than recompute the module summary index >>> in that case. Even in the case where we are still invoking as an analysis >>> pass (opt -module-summary), we would need to figure out how to read in the >>> module summary to use as the analysis result when available (so that it >>> could be invalidated and recomputed when stale). >>> >>> Rather than add this plumbing, and just have it discarded if opt does >>> any optimization, I think we should focus at least for the time being on >>> supporting reading the summary from assembly exactly where we currently >>> read in the summary from bitcode: >>> 1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which >>> currently have to be preceded by "opt -module-summary/-thinlto-bc" to >>> generate an index, but could just build it from assembly instead). >>> 2) For the LTO backends (e.g. tools such as llvm-lto which can consume a >>> combined index and invoke the backends, or "clang -fthinlto-index=" for >>> distributed ThinLTO backend testing), where we could build the combined >>> summary index from assembly instead. >>> >>> This greatly simplifies the reading side, as there are no optimizations >>> performed on the IR after the index is read in these cases that would >>> require invalidation. It also simplifies adding the parsing support, since >>> it gets invoked exactly where we expect to build an index currently (i.e. >>> since we don't currently build or store the ModuleSummaryIndex when parsing >>> the Module from bitcode). It doesn't preclude someone from figuring out how >>> to compute the module summary analysis result from the assembly, and >>> invalidating it after optimization, when reading the Module IR via 'opt' in >>> the future. >>> >>> Does this seem like a reasonable proposal to everyone? >>> >> >> Sounds good to me. >> >> That would make .ll files quite convenient during debugging I think? We >> could disassemble, manually change summaries, and re-assemble a bitcode >> file before running the (thin-)link again. >> > > Yes, that seems more reasonable than what I thought you had in mind. If > the only consumer of this information is llvm-as, then the purpose of the > asm summary format is just to provide a way to create a .bc file for > testing purposes, which is certainly a useful capability. >

That feels a bit surprising if ".ll -> llvm-as -> .bc -> <some other tool - opt, etc>" is different from ".ll -> <opt, etc>". Is that what we're talking about here? Any chance that can be avoided, so that feeding a .ll file works (in the sense that it does the same thing and tests the same behavior) in all the same places that feeding a .bc file does? (as is the case with non-summary-based IR, to the best of my knowledge)

> > Peter > > >> Thanks, >> >> -- >> Mehdi >> >> >> >> >> >>> >>> Thanks, >>> Teresa >>> >>> >>> >>>> Cheers, >>>> >>>> -- >>>> Mehdi >>>> >>>> >>>> >>>> >>>> Le mar. 1 mai 2018 à 11:10, Peter Collingbourne <peter at pcc.me.uk> a >>>> écrit : >>>> >>>>> >>>>> >>>>> On Tue, May 1, 2018 at 9:00 AM, Teresa Johnson <tejohnson at google.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> On Mon, Apr 30, 2018 at 10:21 AM Peter Collingbourne <peter at pcc.me.uk> >>>>>> wrote: >>>>>> >>>>>>> On Mon, Apr 30, 2018 at 8:32 AM, Teresa Johnson < >>>>>>> tejohnson at google.com> wrote: >>>>>>> >>>>>>>> Hi Peter, >>>>>>>> Thanks for your comments, replies below.
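A rough sketch of the workflows under discussion here, with placeholder file names and illustrative llvm-lto2 symbol resolutions; the summary-aware llvm-as/llvm-dis steps and the direct .ll input are the proposed behavior, not something that works today:

  # The debugging loop Mehdi describes: disassemble, tweak summaries, re-assemble, re-run the thin link
  llvm-dis foo.o -o foo.ll                          # summary would be printed as ^N entries (proposed)
  llvm-as foo.ll -o foo.bc                          # summary parsed and re-emitted into bitcode (proposed)
  llvm-lto2 run foo.bc -o out -r foo.bc,main,plx    # thin link; one -r per symbol, resolutions illustrative

  # The equivalence David asks about: feeding the assembly directly should test the same behavior
  llvm-lto2 run foo.ll -o out -r foo.ll,main,plx    # proposed: index built while parsing the assembly

If both invocations behave identically, the .ll path exercises the same code as the .bc path, which is the equivalence being asked about.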
>>>>>>>> Teresa >>>>>>>> >>>>>>>> On Wed, Apr 25, 2018 at 2:08 PM Peter Collingbourne < >>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>> >>>>>>>>> Hi Teresa, >>>>>>>>> >>>>>>>>> Thanks for sending this proposal out. >>>>>>>>> >>>>>>>>> I would again like to register my disagreement with the whole idea >>>>>>>>> of writing summaries in LLVM assembly format. In my view it is clear that >>>>>>>>> this is not the right direction, as it only invites additional complexity >>>>>>>>> and more ways for things to go wrong for no real benefit. However, I don't >>>>>>>>> have the energy to argue that point any further, so I won't stand in the >>>>>>>>> way here. >>>>>>>>> >>>>>>>> >>>>>>>> I assume you are most concerned with the >>>>>>>> re-assembly/deserialization of the summary. My main goal is to get this >>>>>>>> dumped into a text format, and I went this route since the last dumper RFC >>>>>>>> was blocked with the LLVM assembly direction pushed. >>>>>>>> >>>>>>> >>>>>>> Yes, my main concern is with the deserialization. My view is that it >>>>>>> should only be allowed for combined summaries -- allowing it for per-module >>>>>>> is unnecessary as it creates the possibility of things gettting out of >>>>>>> sync. Given that, we don't actually need an assembly representation and we >>>>>>> can use whichever format is most convenient. But given the opposition to >>>>>>> this viewpoint I am willing to concede getting the format (IMHO) right in >>>>>>> favour of something that is acceptable to others, just so that we can test >>>>>>> things in a more reasonable way. >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> On Tue, Apr 24, 2018 at 7:43 AM, Teresa Johnson < >>>>>>>>> tejohnson at google.com> wrote: >>>>>>>>> >>>>>>>>>> Hi everyone, >>>>>>>>>> >>>>>>>>>> I started working on a long-standing request to have the summary >>>>>>>>>> dumped in a readable format to text, and specifically to emit to LLVM >>>>>>>>>> assembly. Proposal below, please let me know your thoughts. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Teresa >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> *RFC: LLVM Assembly format for ThinLTO >>>>>>>>>> Summary========================================Background-----------------ThinLTO >>>>>>>>>> operates on small summaries computed during the compile step (i.e. with “-c >>>>>>>>>> -flto=thin”), which are then analyzed and updated during the Thin Link >>>>>>>>>> stage, and utilized to perform IR updates during the post-link ThinLTO >>>>>>>>>> backends. The summaries are emitted as LLVM Bitcode, however, not currently >>>>>>>>>> in the LLVM assembly.There are two ways to generate a bitcode file >>>>>>>>>> containing summary records for a module: 1. Compile with “clang -c >>>>>>>>>> -flto=thin”2. 
Build from LLVM assembly using “opt -module-summary”Either of >>>>>>>>>> these will result in the ModuleSummaryIndex analysis pass (which builds the >>>>>>>>>> summary index in memory for a module) to be added to the pipeline just >>>>>>>>>> before bitcode emission.Additionally, a combined index is created by >>>>>>>>>> merging all the per-module indexes during the Thin Link, which is >>>>>>>>>> optionally emitted as a bitcode file.Currently, the only way to view these >>>>>>>>>> records is via “llvm-bcanalyzer -dump”, then manually decoding the raw >>>>>>>>>> bitcode dumps.Relatedly, there is YAML reader/writer support for CFI >>>>>>>>>> related summary fields (-wholeprogramdevirt-read-summary and >>>>>>>>>> -wholeprogramdevirt-write-summary). Last summer, GSOC student Charles >>>>>>>>>> Saternos implemented support to dump the summary in YAML from llvm-lto2 >>>>>>>>>> (D34080), including the rest of the summary fields (D34063), however, there >>>>>>>>>> was pushback on the related RFC for dumping via YAML or another format >>>>>>>>>> rather than emitting as LLVM assembly.Goals: 1. Define LLVM assembly format >>>>>>>>>> for summary index2. Define interaction between parsing of summary from LLVM >>>>>>>>>> assembly and synthesis of new summary index from IR.3. Implement printing >>>>>>>>>> and parsing of summary index LLVM assemblyProposed LLVM Assembly >>>>>>>>>> Format----------------------------------------------There are several top >>>>>>>>>> level data structures within the ModuleSummaryIndex: 1. >>>>>>>>>> ModulePathStringTable: Holds the paths to the modules summarized in the >>>>>>>>>> index (only one entry for per-module indexes and multiple in the combined >>>>>>>>>> index), along with their hashes (for incremental builds and global >>>>>>>>>> promotion).2. GlobalValueMap: A map from global value GUIDs to the >>>>>>>>>> corresponding function/variable/alias summary (or summaries for the >>>>>>>>>> combined index and weak linkage).3. CFI-related data structures (TypeIdMap, >>>>>>>>>> CfiFunctionDefs, and CfiFunctionDecls)I have a WIP patch to AsmWriter.cpp >>>>>>>>>> to print the ModuleSummaryIndex that I was using to play with the format. >>>>>>>>>> It currently prints 1 and 2 above. I’ve left the CFI related summary data >>>>>>>>>> structures as a TODO for now, until the format is at least conceptually >>>>>>>>>> agreed, but from looking at those I don’t see an issue with using the same >>>>>>>>>> format (with a note/question for Peter on CFI type test representation >>>>>>>>>> below).I modeled the proposed format on metadata, with a few key >>>>>>>>>> differences noted below. Like metadata, I propose enumerating the entries >>>>>>>>>> with the SlotTracker, and prefixing them with a special character. Avoiding >>>>>>>>>> characters already used in some fashion (i.e. “!” for metadata and “#” for >>>>>>>>>> attributes), I initially have chosen “^”. 
Open to suggestions >>>>>>>>>> though.Consider the following example:extern void foo();int X;int bar() { >>>>>>>>>> foo(); return X;}void barAlias() __attribute__ ((alias ("bar")));int >>>>>>>>>> main() { barAlias(); return bar();}The proposed format has one entry per >>>>>>>>>> ModulePathStringTable entry and one per GlobalValueMap/GUID, and looks >>>>>>>>>> like:^0 = module: {path: testA.o, hash: 5487197307045666224}^1 = gv: {guid: >>>>>>>>>> 1881667236089500162, name: X, summaries: {variable: {module: ^0, flags: >>>>>>>>>> {linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 1}}}}^2 = gv: >>>>>>>>>> {guid: 6699318081062747564, name: foo}^3 = gv: {guid: 15822663052811949562, >>>>>>>>>> name: main, summaries: {function: {module: ^0, flags: {linkage: extern, >>>>>>>>>> notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 5, funcFlags: >>>>>>>>>> {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: >>>>>>>>>> {{callee: ^5, hotness: unknown}, {callee: ^4, hotness: unknown}}}}}^4 = gv: >>>>>>>>>> {guid: 16434608426314478903, name: bar, summaries: {function: {module: ^0, >>>>>>>>>> flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, >>>>>>>>>> insts: 3, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, >>>>>>>>>> returnDoesNotAlias: 0}, calls: {{callee: ^2, hotness: unknown}}, refs: >>>>>>>>>> {^1}}}}^5 = gv: {guid: 18040127437030252312, name: barAlias, summaries: >>>>>>>>>> {alias: {module: ^0, flags: {linkage: extern, notEligibleToImport: 0, live: >>>>>>>>>> 0, dsoLocal: 1}, aliasee: ^4}}}Like metadata, the fields are tagged >>>>>>>>>> (currently using lower camel case, maybe upper camel case would be >>>>>>>>>> preferable).The proposed format has a structure that reflects the data >>>>>>>>>> structures in the summary index. For example, consider the entry “^4”. This >>>>>>>>>> corresponds to the function “bar”. The entry for that GUID in the >>>>>>>>>> GlobalValueMap contains a list of summaries. For per-module summaries such >>>>>>>>>> as this, there will be at most one summary (with no summary list for an >>>>>>>>>> external function like “foo”). In the combined summary there may be >>>>>>>>>> multiple, e.g. in the case of linkonce_odr functions which have definitions >>>>>>>>>> in multiple modules. The summary list for bar (“^4”) contains a >>>>>>>>>> FunctionSummary, so the summary is tagged “function:”. The FunctionSummary >>>>>>>>>> contains both a flags structure (inherited from the base GlobalValueSummary >>>>>>>>>> class), and a funcFlags structure (specific to FunctionSummary). It >>>>>>>>>> therefore contains a brace-enclosed list of flag tags/values for each.Where >>>>>>>>>> a global value summary references another global value summary (e.g. via a >>>>>>>>>> call list, reference list, or aliasee), the entry is referenced by its >>>>>>>>>> slot. E.g. the alias “barAlias” (“^5”) references its aliasee “bar” as >>>>>>>>>> “^4”.Note that in comparison metadata assembly entries tend to be much more >>>>>>>>>> decomposed since many metadata fields are themselves metadata (so then >>>>>>>>>> entries tend to be shorter with references to other metadata >>>>>>>>>> nodes).Currently, I am emitting the summary entries at the end, after the >>>>>>>>>> metadata nodes. Note that the ModuleSummaryIndex is not currently >>>>>>>>>> referenced from the Module, and isn’t currently created when parsing the >>>>>>>>>> Module IR bitcode (there is a separate derived class for reading the >>>>>>>>>> ModuleSummaryIndex from bitcode). 
This is because they are not currently >>>>>>>>>> used at the same time. However, in the future there is no reason why we >>>>>>>>>> couldn’t tag the global values in the Module’s LLVM assembly with the >>>>>>>>>> corresponding summary entry if the ModuleSummaryIndex is available when >>>>>>>>>> printing the Module in the assembly writer. I.e. we could do the following >>>>>>>>>> for “main” from the above example when printing the IR definition (note the >>>>>>>>>> “^3” at the end):define dso_local i32 @main() #0 !dbg !17 ^3 {For CFI data >>>>>>>>>> structures, the format would be similar. It appears that TypeIds are >>>>>>>>>> referred to by string name in the top level TypeIdMap (std::map indexed by >>>>>>>>>> std::string type identifier), whereas they are referenced by GUID within >>>>>>>>>> the FunctionSummary class (i.e. the TypeTests vector and the VFuncId >>>>>>>>>> structure). For the LLVM assembly I think there should be a top level entry >>>>>>>>>> for each TypeIdMap, which lists both the type identifier string and its >>>>>>>>>> GUID (followed by its associated information stored in the map), and the >>>>>>>>>> TypeTests/VFuncId references on the FunctionSummary entries can reference >>>>>>>>>> it by summary slot number. I.e. something like:^1 = typeid: {guid: 12345, >>>>>>>>>> identifier: name_of_type, …^2 = gv: {... {function: {.... typeTests: {^1, >>>>>>>>>> …Peter - is that correct and does that sound ok?* >>>>>>>>>> >>>>>>>>> >>>>>>>>> I don't think that would work because the purpose of the top-level >>>>>>>>> TypeIdMap is to contain resolutions for each type identifier, and >>>>>>>>> per-module summaries do not contain resolutions (only the combined summary >>>>>>>>> does). What that means in practice is that we would not be able to recover >>>>>>>>> and write out a type identifier name for per-module summaries as part of ^1 >>>>>>>>> in your example (well, we could in principle, because the name is stored >>>>>>>>> somewhere in the function's IR, but that could get complicated). >>>>>>>>> >>>>>>>> >>>>>>>> Ah ok. I guess the top-level map then is generated by the regular >>>>>>>> LTO portion of the link (since it presumably requires IR during the Thin >>>>>>>> Link to get into the combined summary)? >>>>>>>> >>>>>>> >>>>>>> Yes, we fill in the map during the LowerTypeTests and >>>>>>> WholeProgramDevirt passes in the regular LTO part of the link, e.g. here: >>>>>>> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#823 >>>>>>> >>>>>>> >>>>>>>> Probably the easiest thing to do is to keep the type identifiers as >>>>>>>>> GUIDs in the function summaries and write out the mapping of type >>>>>>>>> identifiers as a top-level entity. >>>>>>>>> >>>>>>>> >>>>>>>> To confirm, you mean during the compile step create a top-level >>>>>>>> entity that maps GUID -> identifier? >>>>>>>> >>>>>>> >>>>>>> I mean that you could represent this with something like: >>>>>>> >>>>>>> ^typeids = {^1, ^2, ^3} >>>>>>> ^1 = typeid: {identifier: typeid1, ...} >>>>>>> ^2 = typeid: {identifier: typeid2, ...} >>>>>>> ^3 = typeid: {identifier: typeid3, ...} >>>>>>> >>>>>>> There's no need to store the GUIDs here because they can be computed >>>>>>> from the type identifiers. The GUIDs would only be stored in the typeTests >>>>>>> (etc.) fields in each function summary. >>>>>>> >>>>>> >>>>>> I suppose we don't need to store the GUIDs at the top level in the >>>>>> in-memory summary. 
But I think it would be good to emit the GUIDs in the >>>>>> typeid assembly entries because it makes the association in the assembly >>>>>> much more obvious. I.e. going back to my original example: >>>>>> >>>>>> ^1 = typeid: {guid: 12345, identifier: name_of_type, … >>>>>> ^2 = gv: {... {function: {.... typeTests: {^1, … >>>>>> >>>>>> If we didn't include the GUID in the typeid entry, but rather just >>>>>> the identifier, and put the GUID in the typeTest list in the GV's entry, it >>>>>> wouldn't be obvious at all from the assembly listing which typeid goes with >>>>>> which typeTest. It's also less compact to include the GUID in each >>>>>> typeTests list. >>>>>> >>>>> >>>>> I get that, but my point was that in a per-module summary the >>>>> TypeIdMap is empty, so there will be no names, only GUIDs. >>>>> >>>>> For "making the association more obvious" we might just want to have >>>>> the assembly writer emit the GUID of a name as a comment. >>>>> >>>>> >>>>>> Or perhaps we are saying the same thing - I can't tell from your >>>>>> above example if the GUID is also emitted in the "typeid:" entries. >>>>>> >>>>> >>>>> No, it wouldn't be. >>>>> >>>>> I'm not sure there is a need for the: >>>>>> ^typeids = {^1, ^2, ^3} >>>>>> We can just build the typeids list on the fly as " = typeid: " >>>>>> entries are read in. >>>>>> >>>>> >>>>> That's true. Given that nothing actually needs to refer to them, we >>>>> can just represent the typeids as something like >>>>> typeid: {identifier: typeid1, ...} ; guid = 123 >>>>> typeid: {identifier: typeid2, ...} ; guid = 456 >>>>> typeid: {identifier: typeid3, ...} ; guid = 789 >>>>> without an associated number. >>>>> >>>>> Peter >>>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>> Teresa >>>>>> >>>>>> >>>>>> >>>>>>> Peter >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Teresa >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Peter >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> *Issues when Parsing of Summaries from >>>>>>>>>> Assembly--------------------------------------------------------------------When >>>>>>>>>> reading an LLVM assembly file containing module summary entries, a >>>>>>>>>> ModuleSummaryIndex will be created from the entries.Things to consider are >>>>>>>>>> the behavior when: - Invoked with “opt -module-summary” (which currently >>>>>>>>>> builds a new summary index from the IR). Options:1. recompute summary and >>>>>>>>>> throw away summary in the assembly file2. ignore -module-summary and build >>>>>>>>>> the summary from the LLVM assembly3. give an error4. compare the two >>>>>>>>>> summaries (one created from the assembly and the new one created by the >>>>>>>>>> analysis phase from the IR), and error if they are different.My opinion is >>>>>>>>>> to do a), so that the behavior using -module-summary doesn’t change. We >>>>>>>>>> also need a way to force building of a fresh module summary for cases where >>>>>>>>>> the user has modified the LLVM assembly of the IR (see below). - How to >>>>>>>>>> handle older LLVM assembly files that don’t contain new summary fields. >>>>>>>>>> Options:1. Force the LLVM assembly file to be recreated with a new summary. >>>>>>>>>> I.e. “opt -module-summary -o - | llvm-dis”.2. Auto-upgrade, by silently >>>>>>>>>> creating conservative values for the new summary entries.I lean towards b) >>>>>>>>>> (when possible) for user-friendliness and to reduce required churn on test >>>>>>>>>> inputs. 
- How to handle partial or incorrect LLVM assembly summary entries. >>>>>>>>>> How to handle partial summaries depends in part on how we answer the prior >>>>>>>>>> question about auto-upgrading. I think the best option, like there, is to >>>>>>>>>> handle it automatically when possible. However, I do think we should error >>>>>>>>>> on glaring errors like obviously missing information. For example, when >>>>>>>>>> there is summary data in the LLVM assembly, but summary entries are missing >>>>>>>>>> for some global values. E.g. if the user modified the assembly to add a >>>>>>>>>> function but forgot to add a corresponding summary entry. We could still >>>>>>>>>> have subtle issues (e.g. user adds a new call but forgets to update the >>>>>>>>>> caller’s summary call list), but it will be harder to detect those.* >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>>>>>> 408-460-2413 <(408)%20460-2413> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> -- >>>>>>>>> Peter >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>>>> 408-460-2413 <(408)%20460-2413> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -- >>>>>>> Peter >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>> 408-460-2413 <(408)%20460-2413> >>>>> >>>>> >>>>> >>>>> -- >>>>> -- >>>>> Peter >>>>> >>>> >>> >>> >>> -- >>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>> 408-460-2413 <(408)%20460-2413> >>> >> > > > -- > -- > Peter > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
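Tying back to the auto-upgrade options in the proposal quoted above: the first option, forcing the assembly file to be recreated with a fresh summary, is just the round trip already named there, spelled out with illustrative file names:

  opt -module-summary stale.ll -o - | llvm-dis -o fresh.ll   # rebuild the summary from the IR, then disassemble again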