David Blaikie via llvm-dev
2018-May-03 22:58 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Tue, May 1, 2018 at 4:50 PM, Teresa Johnson <tejohnson at google.com> wrote:

Hi Mehdi, thanks for the comments; responses and a tweaked proposal below.
Teresa

On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> wrote:

Hi,

My main concern is this one:

> Currently, I am emitting the summary entries at the end, after the metadata nodes. Note that the ModuleSummaryIndex is not currently referenced from the Module, and isn't currently created when parsing the Module IR bitcode (there is a separate derived class for reading the ModuleSummaryIndex from bitcode). This is because they are not currently used at the same time. However, in the future there is no reason why we couldn't tag the global values in the Module's LLVM assembly with the corresponding summary entry if the ModuleSummaryIndex is available when printing the Module in the assembly writer. I.e. we could do the following for "main" from the above example when printing the IR definition (note the "^3" at the end):

I believe the reason that the ModuleSummaryIndex is not attached to the Module is that it is fundamentally not a piece of IR; it is conceptually really an Analysis result. Usually other analyses don't serialize their result; we happen to serialize this one for an optimization purpose (reloading it makes the thin link faster).

Teresa Johnson:

True. My understanding is that the push for having it serialized via assembly is due to the fact that it is emitted into the bitcode. I know there is disagreement on this reasoning; I am hoping to have a proposal that is acceptable to everyone. =)

Mehdi AMINI:

The fundamental problem is that an analysis result has to be able to be invalidated by IR changes, and attaching this directly to the module wouldn't achieve that. The risk is that when the IR and the summary get out of sync (`clang -O2 my_module_with_summaries.ll -emit-llvm -o my_optimized module_with_summaries.ll`) the summaries would be badly wrong.

Have you looked into what it'd take to make it a "real" analysis in the pass manager?

Teresa Johnson:

Thanks for raising this issue specifically; I hadn't addressed it in my proposal and it is a big one. I am not proposing that we attempt to maintain the summary through optimization passes, and I definitely don't think we should do that. IMO deserializing it should be for testing the thin link and the combined summaries in the backends only. To that end, I have an idea (below; some background first).

Note that in some cases the module summary analysis is an analysis pass, i.e. when invoked by "opt -module-summary=". However, some time ago, when Peter added the support for splitting the bitcode (for CFI purposes) and therefore needed to generate a summary in each partition (Thin and Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module summary builder directly (twice). This writer is what gets invoked now when building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is not invoked/maintained as an analysis pass/result. It would be tricky to figure out how to even split, rather than recompute, the module summary index in that case. Even in the case where we are still invoking it as an analysis pass (opt -module-summary), we would need to figure out how to read in the module summary to use as the analysis result when available (so that it could be invalidated and recomputed when stale).

Rather than add this plumbing, and just have it discarded if opt does any optimization, I think we should focus, at least for the time being, on supporting reading the summary from assembly exactly where we currently read in the summary from bitcode:

1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which currently have to be preceded by "opt -module-summary/-thinlto-bc" to generate an index, but could just build it from assembly instead).
2) For the LTO backends (e.g. tools such as llvm-lto which can consume a combined index and invoke the backends, or "clang -fthinlto-index=" for distributed ThinLTO backend testing), where we could build the combined summary index from assembly instead.

This greatly simplifies the reading side, as there are no optimizations performed on the IR after the index is read in these cases that would require invalidation. It also simplifies adding the parsing support, since it gets invoked exactly where we expect to build an index currently (i.e. since we don't currently build or store the ModuleSummaryIndex when parsing the Module from bitcode). It doesn't preclude someone from figuring out how to compute the module summary analysis result from the assembly, and invalidating it after optimization, when reading the Module IR via 'opt' in the future.

Does this seem like a reasonable proposal to everyone?

On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:

Sounds good to me.

That would make .ll files quite convenient during debugging, I think? We could disassemble, manually change summaries, and re-assemble a bitcode file before running the (thin-)link again.

On Thu, May 3, 2018 at 3:08 PM, Peter Collingbourne via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Yes, that seems more reasonable than what I thought you had in mind. If the only consumer of this information is llvm-as, then the purpose of the asm summary format is just to provide a way to create a .bc file for testing purposes, which is certainly a useful capability.

On Thu, May 3, 2018 at 3:29 PM, David Blaikie <dblaikie at gmail.com> wrote:

That feels a bit surprising if ".ll -> llvm-as -> .bc -> <some other tool - opt, etc>" is different from ".ll -> <opt, etc>". Is that what we're talking about here? Any chance that can be avoided, so that feeding a .ll file works (in the sense of doing the same thing/testing the same behavior) in all the same places that feeding a .bc file does? (As is the case with non-summary-based IR, to the best of my knowledge.)

On Thu, May 3, 2018 at 3:52 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

I may be mistaken, but I don't think we have a lot of tools that can read both .ll and .bc and end up using the summary if it is a .bc file. LTO can't read .ll, for example.
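For concreteness, the testing workflow being discussed looks roughly like the sketch below. The first pair of commands is how summary-bearing input is produced for the thin link today; the last command is hypothetical and assumes the proposed assembly parsing support lands (file names illustrative, symbol resolutions abbreviated to a single -r for brevity):

    # Today: the summary only exists in bitcode, so a .ll test input must first be
    # run through opt to build the per-module summary index.
    opt -module-summary test.ll -o test.bc
    llvm-lto2 run test.bc -o test.out -r=test.bc,main,plx

    # Proposed: the thin link builds the index directly from the assembly.
    llvm-lto2 run test.ll -o test.out -r=test.ll,main,plx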
David Blaikie:

Oh, I didn't know that - why doesn't it read .ll? That seems like an oversight - most/all the other LLVM tools treat .ll and .bc pretty equally, don't they?

Peter Collingbourne:

The only one that I can think of is clang, and presumably we could make that use whichever API we would use in llvm-as for reading the summary from .ll. So the behaviour of most tools would be that ".ll -> llvm-as -> .bc -> <some tool>" vs ".ll -> <some tool>" would end up being the same in both cases: the summary gets discarded.

Peter

On Mon, Apr 30, 2018 at 8:32 AM, Teresa Johnson <tejohnson at google.com> wrote:

Hi Peter,
Thanks for your comments, replies below.
Teresa

On Wed, Apr 25, 2018 at 2:08 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

Hi Teresa,

Thanks for sending this proposal out.

I would again like to register my disagreement with the whole idea of writing summaries in LLVM assembly format. In my view it is clear that this is not the right direction, as it only invites additional complexity and more ways for things to go wrong for no real benefit. However, I don't have the energy to argue that point any further, so I won't stand in the way here.

Teresa Johnson:

I assume you are most concerned with the re-assembly/deserialization of the summary. My main goal is to get this dumped into a text format, and I went this route since the last dumper RFC was blocked with the LLVM assembly direction pushed.

On Mon, Apr 30, 2018 at 10:21 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:

Yes, my main concern is with the deserialization. My view is that it should only be allowed for combined summaries -- allowing it for per-module summaries is unnecessary, as it creates the possibility of things getting out of sync. Given that, we don't actually need an assembly representation and we can use whichever format is most convenient. But given the opposition to this viewpoint I am willing to concede getting the format (IMHO) right in favour of something that is acceptable to others, just so that we can test things in a more reasonable way.

On Tue, Apr 24, 2018 at 7:43 AM, Teresa Johnson <tejohnson at google.com> wrote:

Hi everyone,

I started working on a long-standing request to have the summary dumped in a readable format to text, and specifically to emit it as LLVM assembly. Proposal below, please let me know your thoughts.
Thanks,
Teresa

RFC: LLVM Assembly format for ThinLTO Summary
========================================

Background
-----------------

ThinLTO operates on small summaries computed during the compile step (i.e. with "-c -flto=thin"), which are then analyzed and updated during the Thin Link stage, and utilized to perform IR updates during the post-link ThinLTO backends. The summaries are emitted as LLVM bitcode; however, they are not currently part of the LLVM assembly.

There are two ways to generate a bitcode file containing summary records for a module:
1. Compile with "clang -c -flto=thin"
2. Build from LLVM assembly using "opt -module-summary"
Either of these will result in the ModuleSummaryIndex analysis pass (which builds the summary index in memory for a module) being added to the pipeline just before bitcode emission. Additionally, a combined index is created by merging all the per-module indexes during the Thin Link, which is optionally emitted as a bitcode file.

Currently, the only way to view these records is via "llvm-bcanalyzer -dump", then manually decoding the raw bitcode dumps. Relatedly, there is YAML reader/writer support for CFI-related summary fields (-wholeprogramdevirt-read-summary and -wholeprogramdevirt-write-summary). Last summer, GSOC student Charles Saternos implemented support to dump the summary in YAML from llvm-lto2 (D34080), including the rest of the summary fields (D34063); however, there was pushback on the related RFC against dumping via YAML or another format rather than emitting as LLVM assembly.

Goals:
1. Define LLVM assembly format for the summary index
2. Define the interaction between parsing of the summary from LLVM assembly and synthesis of a new summary index from IR.
3. Implement printing and parsing of the summary index LLVM assembly

Proposed LLVM Assembly Format
----------------------------------------------

There are several top level data structures within the ModuleSummaryIndex:
1. ModulePathStringTable: Holds the paths to the modules summarized in the index (only one entry for per-module indexes and multiple in the combined index), along with their hashes (for incremental builds and global promotion).
2. GlobalValueMap: A map from global value GUIDs to the corresponding function/variable/alias summary (or summaries, for the combined index and weak linkage).
3. CFI-related data structures (TypeIdMap, CfiFunctionDefs, and CfiFunctionDecls)

I have a WIP patch to AsmWriter.cpp to print the ModuleSummaryIndex that I was using to play with the format. It currently prints 1 and 2 above. I've left the CFI-related summary data structures as a TODO for now, until the format is at least conceptually agreed, but from looking at those I don't see an issue with using the same format (with a note/question for Peter on the CFI type test representation below).

I modeled the proposed format on metadata, with a few key differences noted below. Like metadata, I propose enumerating the entries with the SlotTracker, and prefixing them with a special character. Avoiding characters already used in some fashion (i.e. "!" for metadata and "#" for attributes), I initially have chosen "^". Open to suggestions though.

Consider the following example:

    extern void foo();
    int X;
    int bar() { foo(); return X; }
    void barAlias() __attribute__ ((alias ("bar")));
    int main() { barAlias(); return bar(); }

The proposed format has one entry per ModulePathStringTable entry and one per GlobalValueMap/GUID, and looks like:

    ^0 = module: {path: testA.o, hash: 5487197307045666224}
    ^1 = gv: {guid: 1881667236089500162, name: X, summaries: {variable: {module: ^0, flags: {linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 1}}}}
    ^2 = gv: {guid: 6699318081062747564, name: foo}
    ^3 = gv: {guid: 15822663052811949562, name: main, summaries: {function: {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 5, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^5, hotness: unknown}, {callee: ^4, hotness: unknown}}}}}
    ^4 = gv: {guid: 16434608426314478903, name: bar, summaries: {function: {module: ^0, flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 3, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: {{callee: ^2, hotness: unknown}}, refs: {^1}}}}
    ^5 = gv: {guid: 18040127437030252312, name: barAlias, summaries: {alias: {module: ^0, flags: {linkage: extern, notEligibleToImport: 0, live: 0, dsoLocal: 1}, aliasee: ^4}}}

Like metadata, the fields are tagged (currently using lower camel case; maybe upper camel case would be preferable).

The proposed format has a structure that reflects the data structures in the summary index. For example, consider the entry "^4". This corresponds to the function "bar". The entry for that GUID in the GlobalValueMap contains a list of summaries. For per-module summaries such as this, there will be at most one summary (with no summary list for an external function like "foo"). In the combined summary there may be multiple, e.g. in the case of linkonce_odr functions which have definitions in multiple modules. The summary list for bar ("^4") contains a FunctionSummary, so the summary is tagged "function:".
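To illustrate that multiple-summaries case, a combined-index entry for a linkonce_odr function defined in two modules might look roughly like the following under the proposed format (a sketch only; the second module, the function name, the GUID, the hashes, and all field values are invented for illustration):

    ^0 = module: {path: testA.o, hash: 5487197307045666224}
    ^1 = module: {path: testB.o, hash: 7546207684646471099}
    ^2 = gv: {guid: 17377440600225628772, name: lodrFunc, summaries: {function: {module: ^0, flags: {linkage: linkonce_odr, notEligibleToImport: 0, live: 0, dsoLocal: 1}, insts: 2, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}}, function: {module: ^1, flags: {linkage: linkonce_odr, notEligibleToImport: 0, live: 0, dsoLocal: 1}, insts: 2, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}}}}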
The FunctionSummary contains both a flags structure (inherited from the base GlobalValueSummary class) and a funcFlags structure (specific to FunctionSummary). It therefore contains a brace-enclosed list of flag tags/values for each.

Where a global value summary references another global value summary (e.g. via a call list, reference list, or aliasee), the entry is referenced by its slot. E.g. the alias "barAlias" ("^5") references its aliasee "bar" as "^4". Note that, in comparison, metadata assembly entries tend to be much more decomposed, since many metadata fields are themselves metadata (so the entries tend to be shorter, with references to other metadata nodes).

Currently, I am emitting the summary entries at the end, after the metadata nodes. Note that the ModuleSummaryIndex is not currently referenced from the Module, and isn't currently created when parsing the Module IR bitcode (there is a separate derived class for reading the ModuleSummaryIndex from bitcode). This is because they are not currently used at the same time. However, in the future there is no reason why we couldn't tag the global values in the Module's LLVM assembly with the corresponding summary entry if the ModuleSummaryIndex is available when printing the Module in the assembly writer. I.e. we could do the following for "main" from the above example when printing the IR definition (note the "^3" at the end):

    define dso_local i32 @main() #0 !dbg !17 ^3 {

For the CFI data structures, the format would be similar. It appears that TypeIds are referred to by string name in the top level TypeIdMap (a std::map indexed by the std::string type identifier), whereas they are referenced by GUID within the FunctionSummary class (i.e. the TypeTests vector and the VFuncId structure). For the LLVM assembly I think there should be a top level entry for each TypeIdMap entry, which lists both the type identifier string and its GUID (followed by its associated information stored in the map), and the TypeTests/VFuncId references on the FunctionSummary entries can reference it by summary slot number. I.e. something like:

    ^1 = typeid: {guid: 12345, identifier: name_of_type, …
    ^2 = gv: {... {function: {.... typeTests: {^1, …

Peter - is that correct and does that sound ok?

Peter Collingbourne:

I don't think that would work, because the purpose of the top-level TypeIdMap is to contain resolutions for each type identifier, and per-module summaries do not contain resolutions (only the combined summary does). What that means in practice is that we would not be able to recover and write out a type identifier name for per-module summaries as part of ^1 in your example (well, we could in principle, because the name is stored somewhere in the function's IR, but that could get complicated).

Teresa Johnson:

Ah ok. I guess the top-level map then is generated by the regular LTO portion of the link (since it presumably requires IR during the Thin Link to get it into the combined summary)?
Peter Collingbourne:

Yes, we fill in the map during the LowerTypeTests and WholeProgramDevirt passes in the regular LTO part of the link, e.g. here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#823

> Probably the easiest thing to do is to keep the type identifiers as GUIDs in the function summaries and write out the mapping of type identifiers as a top-level entity.

Teresa Johnson:

To confirm, you mean during the compile step create a top-level entity that maps GUID -> identifier?

Peter Collingbourne:

I mean that you could represent this with something like:

    ^typeids = {^1, ^2, ^3}
    ^1 = typeid: {identifier: typeid1, ...}
    ^2 = typeid: {identifier: typeid2, ...}
    ^3 = typeid: {identifier: typeid3, ...}

There's no need to store the GUIDs here because they can be computed from the type identifiers. The GUIDs would only be stored in the typeTests (etc.) fields in each function summary.

On Tue, May 1, 2018 at 9:00 AM, Teresa Johnson <tejohnson at google.com> wrote:

I suppose we don't need to store the GUIDs at the top level in the in-memory summary. But I think it would be good to emit the GUIDs in the typeid assembly entries, because it makes the association in the assembly much more obvious. I.e. going back to my original example:

    ^1 = typeid: {guid: 12345, identifier: name_of_type, …
    ^2 = gv: {... {function: {.... typeTests: {^1, …

If we didn't include the GUID in the typeid entry, but rather just the identifier, and put the GUID in the typeTest list in the GV's entry, it wouldn't be obvious at all from the assembly listing which typeid goes with which typeTest. It's also less compact to include the GUID in each typeTests list.

On Tue, May 1, 2018 at 11:10 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:

I get that, but my point was that in a per-module summary the TypeIdMap is empty, so there will be no names, only GUIDs.

For "making the association more obvious" we might just want to have the assembly writer emit the GUID of a name as a comment.

Teresa Johnson:

Or perhaps we are saying the same thing - I can't tell from your above example if the GUID is also emitted in the "typeid:" entries.

Peter Collingbourne:

No, it wouldn't be.

Teresa Johnson:

I'm not sure there is a need for the:

    ^typeids = {^1, ^2, ^3}

We can just build the typeids list on the fly as "= typeid:" entries are read in.

Peter Collingbourne:

That's true. Given that nothing actually needs to refer to them, we can just represent the typeids as something like

    typeid: {identifier: typeid1, ...} ; guid = 123
    typeid: {identifier: typeid2, ...} ; guid = 456
    typeid: {identifier: typeid3, ...} ; guid = 789

without an associated number.
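Putting the two cases side by side, the difference being discussed might come out something like the sketch below (purely illustrative, in the notation above; the identifier and GUID values are invented, and the exact syntax is still open):

    ; Combined summary: the TypeIdMap has entries, so the identifier is available and
    ; the GUID can be recomputed from it (or printed as a comment, per Peter's suggestion).
    typeid: {identifier: _ZTS1A, ...} ; guid = 7004155349499253778

    ; Per-module summary: the TypeIdMap is empty, so only the GUIDs referenced from the
    ; function summaries' typeTests fields are available.
    ^3 = gv: {guid: 15822663052811949562, name: main, summaries: {function: {module: ^0, ..., typeTests: {7004155349499253778}}}}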
Issues when Parsing of Summaries from Assembly
--------------------------------------------------------------------

When reading an LLVM assembly file containing module summary entries, a ModuleSummaryIndex will be created from the entries. Things to consider are the behavior when:

- Invoked with "opt -module-summary" (which currently builds a new summary index from the IR). Options:
  a. recompute the summary and throw away the summary in the assembly file
  b. ignore -module-summary and build the summary from the LLVM assembly
  c. give an error
  d. compare the two summaries (one created from the assembly and the new one created by the analysis phase from the IR), and error if they are different.
  My opinion is to do a), so that the behavior using -module-summary doesn't change. We also need a way to force building of a fresh module summary for cases where the user has modified the LLVM assembly of the IR (see below).

- How to handle older LLVM assembly files that don't contain new summary fields. Options:
  a. Force the LLVM assembly file to be recreated with a new summary, i.e. "opt -module-summary -o - | llvm-dis".
  b. Auto-upgrade, by silently creating conservative values for the new summary entries.
  I lean towards b) (when possible) for user-friendliness and to reduce required churn on test inputs.

- How to handle partial or incorrect LLVM assembly summary entries. How to handle partial summaries depends in part on how we answer the prior question about auto-upgrading. I think the best option there is to handle it automatically when possible. However, I do think we should error on glaring problems like obviously missing information, for example when there is summary data in the LLVM assembly but summary entries are missing for some global values (e.g. if the user modified the assembly to add a function but forgot to add a corresponding summary entry). We could still have subtle issues (e.g. a user adds a new call but forgets to update the caller's summary call list), but it will be harder to detect those.

--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
Peter Collingbourne via llvm-dev
2018-May-03 23:09 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 3:58 PM, David Blaikie <dblaikie at gmail.com> wrote:

> Oh, I didn't know that - why doesn't it read .ll? That seems like an oversight - most/all the other LLVM tools treat .ll and .bc pretty equally, don't they?

Yes, the developer-facing tools do (clang being one of them, as it has some developer-facing features), but linkers (the LTO API consumers) are user-facing, and have no need to consume .ll files. Getting them to read .ll has other complications as well. For example, Unix linkers will interpret textual input files as linker scripts, so there would need to be some mechanism to distinguish .ll files from linker scripts.

llvm-lto2 (the LTO API test driver) is a developer-facing tool, and we could conceivably make it read .ll files and pass them to the API as .bc files.

Peter
David Blaikie via llvm-dev
2018-May-03 23:12 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 4:09 PM Peter Collingbourne <peter at pcc.me.uk> wrote:

> On Thu, May 3, 2018 at 3:58 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> [...]
>>
>> Oh, I didn't know that - why doesn't it read .ll? That seems like an
>> oversight - most/all the other LLVM tools treat .ll and .bc pretty
>> equally, don't they?
>
> Yes, the developer-facing tools do (clang being one of them, as it has
> some developer-facing features), but linkers (the LTO API consumers) are
> user-facing and have no need to consume .ll files. Getting them to read
> .ll has other complications as well. For example, Unix linkers will
> interpret textual input files as linker scripts, so there would need to
> be some mechanism to distinguish .ll files from linker scripts.

Yeah, no worries about the linker - though I guess it could be convenient
if it did some lookahead & could detect a .ll file as being different from
a linker script - but that's a "nice to have"/heroics, not something I'd
expect.

> llvm-lto2 (the LTO API test driver) is a developer-facing tool, and we
> could conceivably make it read .ll files and pass them to the API as .bc
> files.

Right - this was more what I had in mind. Basically, I'd find it weird to
come across an LLVM test case that first ran a .ll file through llvm-as
and then ran the tool in question (llvm-lto2) on the result. I'd likely
try to remove the indirection & be a bit annoyed if that failed.

- Dave

> Peter
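For concreteness, the indirection being discussed looks roughly like this in
a ThinLTO-style lit test. This is a hypothetical sketch rather than an
existing test: the module body and the -r resolution string are illustrative,
and the second invocation assumes llvm-lto2 has learned to consume a
summary-carrying .ll directly, which is exactly the capability under
discussion.

; Hypothetical ThinLTO lit test, for illustration only.
;
; Today the .ll first has to be assembled into summary-carrying bitcode:
; RUN: opt -module-summary %s -o %t.bc
; RUN: llvm-lto2 run %t.bc -o %t.o -r=%t.bc,main,plx
;
; The shape being asked for - feed the .ll (carrying the proposed
; assembly-format summary) straight to the tool, with no opt/llvm-as
; indirection (hypothetical today):
; RUN: llvm-lto2 run %s -o %t.o -r=%s,main,plx

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @main() {
entry:
  ret i32 0
}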