Peter Collingbourne via llvm-dev
2018-May-03 22:52 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 3:29 PM, David Blaikie <dblaikie at gmail.com> wrote:> > > On Thu, May 3, 2018 at 3:08 PM Peter Collingbourne via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> wrote: >> >>> >>> >>> Le mar. 1 mai 2018 à 16:50, Teresa Johnson <tejohnson at google.com> a >>> écrit : >>> >>>> Hi Mehdi, thanks for the comments, responses and a tweaked proposal >>>> below. Teresa >>>> >>>> On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> My main concern is this one: >>>>> >>>>> > Currently, I am emitting the summary entries at the end, after the >>>>> metadata nodes. Note that the ModuleSummaryIndex is not currently >>>>> referenced from the Module, and isn’t currently created when parsing the >>>>> Module IR bitcode (there is a separate derived class for reading the >>>>> ModuleSummaryIndex from bitcode). This is because they are not currently >>>>> used at the same time. However, in the future there is no reason why we >>>>> couldn’t tag the global values in the Module’s LLVM assembly with the >>>>> corresponding summary entry if the ModuleSummaryIndex is available when >>>>> printing the Module in the assembly writer. I.e. we could do the following >>>>> for “main” from the above example when printing the IR definition (note the >>>>> “^3” at the end): >>>>> >>>>> >>>>> I believe the reason that the ModuleSummaryIndex is not attached to >>>>> the Module is that it is fundamentally not a piece of IR, but it is >>>>> conceptually really an Analysis result. >>>>> Usually other analyses don't serialize their result, we happen to >>>>> serialize this one for an optimization purpose (reloading it and making the >>>>> thin-link faster). >>>>> >>>> >>>> True. My understanding is that the push for having it serialized via >>>> assembly is due to the fact that it is emitted into the bitcode. I know >>>> there is disagreement on this reasoning, I am hoping to have a proposal >>>> that is acceptable to everyone. =) >>>> >>>> >>>> >>>>> The fundamental problem is that an analysis result has to be able to >>>>> be invalidated with IR changes, attaching this directly to the module >>>>> wouldn't achieve this. The risk is that when the IR and the summary get >>>>> out-of-sync (`clang -O2 my_module_with_summaries.ll -emit-llvm -o >>>>> my_optimized module_with_summaries.ll`) the summaries would be badly wrong. >>>>> >>>>> Have you looked into what it'd take to make it a "real" analysis in >>>>> the pass manager? >>>>> >>>> >>>> Thanks for raising this issue specifically, I hadn't addressed it in my >>>> proposal and it is a big one. I am not proposing that we attempt to >>>> maintain the summary through optimization passes, and definitely don't >>>> think we should do that. IMO deserializing it should be for testing the >>>> thin link and the combined summaries in the backends only. To that end, I >>>> have an idea (below some background first). >>>> >>>> Note that in some cases the module summary analysis is an analysis >>>> pass. I.e. when invoked by "opt -module-summary=". However, some time ago >>>> when Peter added the support for splitting the bitcode (for CFI purposes) >>>> and therefore needed to generate a summary in each partition (Thin and >>>> Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module >>>> summary builder directly (twice). This writer is what gets invoked now when >>>> building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is >>>> not invoked/maintained as an analysis pass/result. It would be tricky to >>>> figure out how to even split rather than recompute the module summary index >>>> in that case. Even in the case where we are still invoking as an analysis >>>> pass (opt -module-summary), we would need to figure out how to read in the >>>> module summary to use as the analysis result when available (so that it >>>> could be invalidated and recomputed when stale). >>>> >>>> Rather than add this plumbing, and just have it discarded if opt does >>>> any optimization, I think we should focus at least for the time being on >>>> supporting reading the summary from assembly exactly where we currently >>>> read in the summary from bitcode: >>>> 1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which >>>> currently have to be preceded by "opt -module-summary/-thinlto-bc" to >>>> generate an index, but could just build it from assembly instead). >>>> 2) For the LTO backends (e.g. tools such as llvm-lto which can consume >>>> a combined index and invoke the backends, or "clang -fthinlto-index=" for >>>> distributed ThinLTO backend testing), where we could build the combined >>>> summary index from assembly instead. >>>> >>>> This greatly simplifies the reading side, as there are no optimizations >>>> performed on the IR after the index is read in these cases that would >>>> require invalidation. It also simplifies adding the parsing support, since >>>> it gets invoked exactly where we expect to build an index currently (i.e. >>>> since we don't currently build or store the ModuleSummaryIndex when parsing >>>> the Module from bitcode). It doesn't preclude someone from figuring out how >>>> to compute the module summary analysis result from the assembly, and >>>> invalidating it after optimization, when reading the Module IR via 'opt' in >>>> the future. >>>> >>>> Does this seem like a reasonable proposal to everyone? >>>> >>> >>> Sounds good to me. >>> >>> That would make .ll files quite convenient during debugging I think? We >>> could disassemble, manually change summaries, and re-assemble a bitcode >>> file before running the (thin-)link again. >>> >> >> Yes, that seems more reasonable than what I thought you had in mind. If >> the only consumer of this information is llvm-as, then the purpose of the >> asm summary format is just to provide a way to create a .bc file for >> testing purposes, which is certainly a useful capability. >> > > That feels a bit surprising if ".ll -> llvm-as -> .bc -> <some other tool > - opt, etc>" is different from ".ll -> <opt, etc>". Is that what we're > talking about here? Any chance that can be avoided & feeding a .ll file > works (in the sense of does the same thing/tests the same behavior) in all > the same places that feeding a .bc file does? (as is the case with > non-summary-based IR, to the best of my knowledge) >I may be mistaken, but I don't think we have a lot of tools that can read both .ll and .bc and end up using the summary if it is a .bc file. LTO can't read .ll, for example. The only one that I can think of is clang and presumably we could make that use whichever API we would use in llvm-as for reading the summary from .ll. So the behaviour of most tools would be that ".ll -> llvm-as -> .bc -> <some tool>" vs ".ll -> <some tool>" would end up being the same in both cases: the summary gets discarded. Peter> >> >> Peter >> >> >>> Thanks, >>> >>> -- >>> Mehdi >>> >>> >>> >>> >>> >>>> >>>> Thanks, >>>> Teresa >>>> >>>> >>>> >>>>> Cheers, >>>>> >>>>> -- >>>>> Mehdi >>>>> >>>>> >>>>> >>>>> >>>>> Le mar. 1 mai 2018 à 11:10, Peter Collingbourne <peter at pcc.me.uk> a >>>>> écrit : >>>>> >>>>>> >>>>>> >>>>>> On Tue, May 1, 2018 at 9:00 AM, Teresa Johnson <tejohnson at google.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Apr 30, 2018 at 10:21 AM Peter Collingbourne < >>>>>>> peter at pcc.me.uk> wrote: >>>>>>> >>>>>>>> On Mon, Apr 30, 2018 at 8:32 AM, Teresa Johnson < >>>>>>>> tejohnson at google.com> wrote: >>>>>>>> >>>>>>>>> Hi Peter, >>>>>>>>> Thanks for your comments, replies below. >>>>>>>>> Teresa >>>>>>>>> >>>>>>>>> On Wed, Apr 25, 2018 at 2:08 PM Peter Collingbourne < >>>>>>>>> peter at pcc.me.uk> wrote: >>>>>>>>> >>>>>>>>>> Hi Teresa, >>>>>>>>>> >>>>>>>>>> Thanks for sending this proposal out. >>>>>>>>>> >>>>>>>>>> I would again like to register my disagreement with the whole >>>>>>>>>> idea of writing summaries in LLVM assembly format. In my view it is clear >>>>>>>>>> that this is not the right direction, as it only invites additional >>>>>>>>>> complexity and more ways for things to go wrong for no real benefit. >>>>>>>>>> However, I don't have the energy to argue that point any further, so I >>>>>>>>>> won't stand in the way here. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I assume you are most concerned with the >>>>>>>>> re-assembly/deserialization of the summary. My main goal is to get this >>>>>>>>> dumped into a text format, and I went this route since the last dumper RFC >>>>>>>>> was blocked with the LLVM assembly direction pushed. >>>>>>>>> >>>>>>>> >>>>>>>> Yes, my main concern is with the deserialization. My view is that >>>>>>>> it should only be allowed for combined summaries -- allowing it for >>>>>>>> per-module is unnecessary as it creates the possibility of things gettting >>>>>>>> out of sync. Given that, we don't actually need an assembly representation >>>>>>>> and we can use whichever format is most convenient. But given the >>>>>>>> opposition to this viewpoint I am willing to concede getting the format >>>>>>>> (IMHO) right in favour of something that is acceptable to others, just so >>>>>>>> that we can test things in a more reasonable way. >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Tue, Apr 24, 2018 at 7:43 AM, Teresa Johnson < >>>>>>>>>> tejohnson at google.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi everyone, >>>>>>>>>>> >>>>>>>>>>> I started working on a long-standing request to have the summary >>>>>>>>>>> dumped in a readable format to text, and specifically to emit to LLVM >>>>>>>>>>> assembly. Proposal below, please let me know your thoughts. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Teresa >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> *RFC: LLVM Assembly format for ThinLTO >>>>>>>>>>> Summary========================================Background-----------------ThinLTO >>>>>>>>>>> operates on small summaries computed during the compile step (i.e. with “-c >>>>>>>>>>> -flto=thin”), which are then analyzed and updated during the Thin Link >>>>>>>>>>> stage, and utilized to perform IR updates during the post-link ThinLTO >>>>>>>>>>> backends. The summaries are emitted as LLVM Bitcode, however, not currently >>>>>>>>>>> in the LLVM assembly.There are two ways to generate a bitcode file >>>>>>>>>>> containing summary records for a module: 1. Compile with “clang -c >>>>>>>>>>> -flto=thin”2. Build from LLVM assembly using “opt -module-summary”Either of >>>>>>>>>>> these will result in the ModuleSummaryIndex analysis pass (which builds the >>>>>>>>>>> summary index in memory for a module) to be added to the pipeline just >>>>>>>>>>> before bitcode emission.Additionally, a combined index is created by >>>>>>>>>>> merging all the per-module indexes during the Thin Link, which is >>>>>>>>>>> optionally emitted as a bitcode file.Currently, the only way to view these >>>>>>>>>>> records is via “llvm-bcanalyzer -dump”, then manually decoding the raw >>>>>>>>>>> bitcode dumps.Relatedly, there is YAML reader/writer support for CFI >>>>>>>>>>> related summary fields (-wholeprogramdevirt-read-summary and >>>>>>>>>>> -wholeprogramdevirt-write-summary). Last summer, GSOC student Charles >>>>>>>>>>> Saternos implemented support to dump the summary in YAML from llvm-lto2 >>>>>>>>>>> (D34080), including the rest of the summary fields (D34063), however, there >>>>>>>>>>> was pushback on the related RFC for dumping via YAML or another format >>>>>>>>>>> rather than emitting as LLVM assembly.Goals: 1. Define LLVM assembly format >>>>>>>>>>> for summary index2. Define interaction between parsing of summary from LLVM >>>>>>>>>>> assembly and synthesis of new summary index from IR.3. Implement printing >>>>>>>>>>> and parsing of summary index LLVM assemblyProposed LLVM Assembly >>>>>>>>>>> Format----------------------------------------------There are several top >>>>>>>>>>> level data structures within the ModuleSummaryIndex: 1. >>>>>>>>>>> ModulePathStringTable: Holds the paths to the modules summarized in the >>>>>>>>>>> index (only one entry for per-module indexes and multiple in the combined >>>>>>>>>>> index), along with their hashes (for incremental builds and global >>>>>>>>>>> promotion).2. GlobalValueMap: A map from global value GUIDs to the >>>>>>>>>>> corresponding function/variable/alias summary (or summaries for the >>>>>>>>>>> combined index and weak linkage).3. CFI-related data structures (TypeIdMap, >>>>>>>>>>> CfiFunctionDefs, and CfiFunctionDecls)I have a WIP patch to AsmWriter.cpp >>>>>>>>>>> to print the ModuleSummaryIndex that I was using to play with the format. >>>>>>>>>>> It currently prints 1 and 2 above. I’ve left the CFI related summary data >>>>>>>>>>> structures as a TODO for now, until the format is at least conceptually >>>>>>>>>>> agreed, but from looking at those I don’t see an issue with using the same >>>>>>>>>>> format (with a note/question for Peter on CFI type test representation >>>>>>>>>>> below).I modeled the proposed format on metadata, with a few key >>>>>>>>>>> differences noted below. Like metadata, I propose enumerating the entries >>>>>>>>>>> with the SlotTracker, and prefixing them with a special character. Avoiding >>>>>>>>>>> characters already used in some fashion (i.e. “!” for metadata and “#” for >>>>>>>>>>> attributes), I initially have chosen “^”. Open to suggestions >>>>>>>>>>> though.Consider the following example:extern void foo();int X;int bar() { >>>>>>>>>>> foo(); return X;}void barAlias() __attribute__ ((alias ("bar")));int >>>>>>>>>>> main() { barAlias(); return bar();}The proposed format has one entry per >>>>>>>>>>> ModulePathStringTable entry and one per GlobalValueMap/GUID, and looks >>>>>>>>>>> like:^0 = module: {path: testA.o, hash: 5487197307045666224}^1 = gv: {guid: >>>>>>>>>>> 1881667236089500162, name: X, summaries: {variable: {module: ^0, flags: >>>>>>>>>>> {linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 1}}}}^2 = gv: >>>>>>>>>>> {guid: 6699318081062747564, name: foo}^3 = gv: {guid: 15822663052811949562, >>>>>>>>>>> name: main, summaries: {function: {module: ^0, flags: {linkage: extern, >>>>>>>>>>> notEligibleToImport: 1, live: 0, dsoLocal: 1}, insts: 5, funcFlags: >>>>>>>>>>> {readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0}, calls: >>>>>>>>>>> {{callee: ^5, hotness: unknown}, {callee: ^4, hotness: unknown}}}}}^4 = gv: >>>>>>>>>>> {guid: 16434608426314478903, name: bar, summaries: {function: {module: ^0, >>>>>>>>>>> flags: {linkage: extern, notEligibleToImport: 1, live: 0, dsoLocal: 1}, >>>>>>>>>>> insts: 3, funcFlags: {readNone: 0, readOnly: 0, noRecurse: 0, >>>>>>>>>>> returnDoesNotAlias: 0}, calls: {{callee: ^2, hotness: unknown}}, refs: >>>>>>>>>>> {^1}}}}^5 = gv: {guid: 18040127437030252312, name: barAlias, summaries: >>>>>>>>>>> {alias: {module: ^0, flags: {linkage: extern, notEligibleToImport: 0, live: >>>>>>>>>>> 0, dsoLocal: 1}, aliasee: ^4}}}Like metadata, the fields are tagged >>>>>>>>>>> (currently using lower camel case, maybe upper camel case would be >>>>>>>>>>> preferable).The proposed format has a structure that reflects the data >>>>>>>>>>> structures in the summary index. For example, consider the entry “^4”. This >>>>>>>>>>> corresponds to the function “bar”. The entry for that GUID in the >>>>>>>>>>> GlobalValueMap contains a list of summaries. For per-module summaries such >>>>>>>>>>> as this, there will be at most one summary (with no summary list for an >>>>>>>>>>> external function like “foo”). In the combined summary there may be >>>>>>>>>>> multiple, e.g. in the case of linkonce_odr functions which have definitions >>>>>>>>>>> in multiple modules. The summary list for bar (“^4”) contains a >>>>>>>>>>> FunctionSummary, so the summary is tagged “function:”. The FunctionSummary >>>>>>>>>>> contains both a flags structure (inherited from the base GlobalValueSummary >>>>>>>>>>> class), and a funcFlags structure (specific to FunctionSummary). It >>>>>>>>>>> therefore contains a brace-enclosed list of flag tags/values for each.Where >>>>>>>>>>> a global value summary references another global value summary (e.g. via a >>>>>>>>>>> call list, reference list, or aliasee), the entry is referenced by its >>>>>>>>>>> slot. E.g. the alias “barAlias” (“^5”) references its aliasee “bar” as >>>>>>>>>>> “^4”.Note that in comparison metadata assembly entries tend to be much more >>>>>>>>>>> decomposed since many metadata fields are themselves metadata (so then >>>>>>>>>>> entries tend to be shorter with references to other metadata >>>>>>>>>>> nodes).Currently, I am emitting the summary entries at the end, after the >>>>>>>>>>> metadata nodes. Note that the ModuleSummaryIndex is not currently >>>>>>>>>>> referenced from the Module, and isn’t currently created when parsing the >>>>>>>>>>> Module IR bitcode (there is a separate derived class for reading the >>>>>>>>>>> ModuleSummaryIndex from bitcode). This is because they are not currently >>>>>>>>>>> used at the same time. However, in the future there is no reason why we >>>>>>>>>>> couldn’t tag the global values in the Module’s LLVM assembly with the >>>>>>>>>>> corresponding summary entry if the ModuleSummaryIndex is available when >>>>>>>>>>> printing the Module in the assembly writer. I.e. we could do the following >>>>>>>>>>> for “main” from the above example when printing the IR definition (note the >>>>>>>>>>> “^3” at the end):define dso_local i32 @main() #0 !dbg !17 ^3 {For CFI data >>>>>>>>>>> structures, the format would be similar. It appears that TypeIds are >>>>>>>>>>> referred to by string name in the top level TypeIdMap (std::map indexed by >>>>>>>>>>> std::string type identifier), whereas they are referenced by GUID within >>>>>>>>>>> the FunctionSummary class (i.e. the TypeTests vector and the VFuncId >>>>>>>>>>> structure). For the LLVM assembly I think there should be a top level entry >>>>>>>>>>> for each TypeIdMap, which lists both the type identifier string and its >>>>>>>>>>> GUID (followed by its associated information stored in the map), and the >>>>>>>>>>> TypeTests/VFuncId references on the FunctionSummary entries can reference >>>>>>>>>>> it by summary slot number. I.e. something like:^1 = typeid: {guid: 12345, >>>>>>>>>>> identifier: name_of_type, …^2 = gv: {... {function: {.... typeTests: {^1, >>>>>>>>>>> …Peter - is that correct and does that sound ok?* >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I don't think that would work because the purpose of the >>>>>>>>>> top-level TypeIdMap is to contain resolutions for each type identifier, and >>>>>>>>>> per-module summaries do not contain resolutions (only the combined summary >>>>>>>>>> does). What that means in practice is that we would not be able to recover >>>>>>>>>> and write out a type identifier name for per-module summaries as part of ^1 >>>>>>>>>> in your example (well, we could in principle, because the name is stored >>>>>>>>>> somewhere in the function's IR, but that could get complicated). >>>>>>>>>> >>>>>>>>> >>>>>>>>> Ah ok. I guess the top-level map then is generated by the regular >>>>>>>>> LTO portion of the link (since it presumably requires IR during the Thin >>>>>>>>> Link to get into the combined summary)? >>>>>>>>> >>>>>>>> >>>>>>>> Yes, we fill in the map during the LowerTypeTests and >>>>>>>> WholeProgramDevirt passes in the regular LTO part of the link, e.g. here: >>>>>>>> http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/LowerTypeTests.cpp#823 >>>>>>>> >>>>>>>> >>>>>>>>> Probably the easiest thing to do is to keep the type identifiers >>>>>>>>>> as GUIDs in the function summaries and write out the mapping of type >>>>>>>>>> identifiers as a top-level entity. >>>>>>>>>> >>>>>>>>> >>>>>>>>> To confirm, you mean during the compile step create a top-level >>>>>>>>> entity that maps GUID -> identifier? >>>>>>>>> >>>>>>>> >>>>>>>> I mean that you could represent this with something like: >>>>>>>> >>>>>>>> ^typeids = {^1, ^2, ^3} >>>>>>>> ^1 = typeid: {identifier: typeid1, ...} >>>>>>>> ^2 = typeid: {identifier: typeid2, ...} >>>>>>>> ^3 = typeid: {identifier: typeid3, ...} >>>>>>>> >>>>>>>> There's no need to store the GUIDs here because they can be >>>>>>>> computed from the type identifiers. The GUIDs would only be stored in the >>>>>>>> typeTests (etc.) fields in each function summary. >>>>>>>> >>>>>>> >>>>>>> I suppose we don't need to store the GUIDs at the top level in the >>>>>>> in-memory summary. But I think it would be good to emit the GUIDs in the >>>>>>> typeid assembly entries because it makes the association in the assembly >>>>>>> much more obvious. I.e. going back to my original example: >>>>>>> >>>>>>> ^1 = typeid: {guid: 12345, identifier: name_of_type, … >>>>>>> ^2 = gv: {... {function: {.... typeTests: {^1, … >>>>>>> >>>>>>> If we didn't include the GUID in the typeid entry, but rather just >>>>>>> the identifier, and put the GUID in the typeTest list in the GV's entry, it >>>>>>> wouldn't be obvious at all from the assembly listing which typeid goes with >>>>>>> which typeTest. It's also less compact to include the GUID in each >>>>>>> typeTests list. >>>>>>> >>>>>> >>>>>> I get that, but my point was that in a per-module summary the >>>>>> TypeIdMap is empty, so there will be no names, only GUIDs. >>>>>> >>>>>> For "making the association more obvious" we might just want to have >>>>>> the assembly writer emit the GUID of a name as a comment. >>>>>> >>>>>> >>>>>>> Or perhaps we are saying the same thing - I can't tell from your >>>>>>> above example if the GUID is also emitted in the "typeid:" entries. >>>>>>> >>>>>> >>>>>> No, it wouldn't be. >>>>>> >>>>>> I'm not sure there is a need for the: >>>>>>> ^typeids = {^1, ^2, ^3} >>>>>>> We can just build the typeids list on the fly as " = typeid: " >>>>>>> entries are read in. >>>>>>> >>>>>> >>>>>> That's true. Given that nothing actually needs to refer to them, we >>>>>> can just represent the typeids as something like >>>>>> typeid: {identifier: typeid1, ...} ; guid = 123 >>>>>> typeid: {identifier: typeid2, ...} ; guid = 456 >>>>>> typeid: {identifier: typeid3, ...} ; guid = 789 >>>>>> without an associated number. >>>>>> >>>>>> Peter >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Teresa >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Peter >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Teresa >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Peter >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> *Issues when Parsing of Summaries from >>>>>>>>>>> Assembly--------------------------------------------------------------------When >>>>>>>>>>> reading an LLVM assembly file containing module summary entries, a >>>>>>>>>>> ModuleSummaryIndex will be created from the entries.Things to consider are >>>>>>>>>>> the behavior when: - Invoked with “opt -module-summary” (which currently >>>>>>>>>>> builds a new summary index from the IR). Options:1. recompute summary and >>>>>>>>>>> throw away summary in the assembly file2. ignore -module-summary and build >>>>>>>>>>> the summary from the LLVM assembly3. give an error4. compare the two >>>>>>>>>>> summaries (one created from the assembly and the new one created by the >>>>>>>>>>> analysis phase from the IR), and error if they are different.My opinion is >>>>>>>>>>> to do a), so that the behavior using -module-summary doesn’t change. We >>>>>>>>>>> also need a way to force building of a fresh module summary for cases where >>>>>>>>>>> the user has modified the LLVM assembly of the IR (see below). - How to >>>>>>>>>>> handle older LLVM assembly files that don’t contain new summary fields. >>>>>>>>>>> Options:1. Force the LLVM assembly file to be recreated with a new summary. >>>>>>>>>>> I.e. “opt -module-summary -o - | llvm-dis”.2. Auto-upgrade, by silently >>>>>>>>>>> creating conservative values for the new summary entries.I lean towards b) >>>>>>>>>>> (when possible) for user-friendliness and to reduce required churn on test >>>>>>>>>>> inputs. - How to handle partial or incorrect LLVM assembly summary entries. >>>>>>>>>>> How to handle partial summaries depends in part on how we answer the prior >>>>>>>>>>> question about auto-upgrading. I think the best option like there is to >>>>>>>>>>> handle it automatically when possible. However, I do think we should error >>>>>>>>>>> on glaring errors like obviously missing information. For example, when >>>>>>>>>>> there is summary data in the LLVM assembly, but summary entries are missing >>>>>>>>>>> for some global values. E.g. if the user modified the assembly to add a >>>>>>>>>>> function but forgot to add a corresponding summary entry. We could still >>>>>>>>>>> have subtle issues (e.g. user adds a new call but forgets to update the >>>>>>>>>>> caller’s summary call list), but it will be harder to detect those.* >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>>>>>>> 408-460-2413 <(408)%20460-2413> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> -- >>>>>>>>>> Peter >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>>>>> 408-460-2413 <(408)%20460-2413> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> -- >>>>>>>> Peter >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>>>>> 408-460-2413 <(408)%20460-2413> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> -- >>>>>> Peter >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Teresa Johnson | Software Engineer | tejohnson at google.com | >>>> 408-460-2413 <(408)%20460-2413> >>>> >>> >> >> >> -- >> -- >> Peter >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-- -- Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180503/18719d92/attachment-0001.html>
Teresa Johnson via llvm-dev
2018-May-03 23:06 UTC
[llvm-dev] RFC: LLVM Assembly format for ThinLTO Summary
On Thu, May 3, 2018 at 3:54 PM, Peter Collingbourne via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Re-sending with trimmed quotation. > > On Thu, May 3, 2018 at 3:52 PM, Peter Collingbourne <peter at pcc.me.uk> > wrote: > >> >> >> On Thu, May 3, 2018 at 3:29 PM, David Blaikie <dblaikie at gmail.com> wrote: >> >>> >>> >>> On Thu, May 3, 2018 at 3:08 PM Peter Collingbourne via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> On Thu, May 3, 2018 at 2:44 PM, Mehdi AMINI <joker.eph at gmail.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> Le mar. 1 mai 2018 à 16:50, Teresa Johnson <tejohnson at google.com> a >>>>> écrit : >>>>> >>>>>> Hi Mehdi, thanks for the comments, responses and a tweaked proposal >>>>>> below. Teresa >>>>>> >>>>>> On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI <joker.eph at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> My main concern is this one: >>>>>>> >>>>>>> > Currently, I am emitting the summary entries at the end, after the >>>>>>> metadata nodes. Note that the ModuleSummaryIndex is not currently >>>>>>> referenced from the Module, and isn’t currently created when parsing the >>>>>>> Module IR bitcode (there is a separate derived class for reading the >>>>>>> ModuleSummaryIndex from bitcode). This is because they are not currently >>>>>>> used at the same time. However, in the future there is no reason why we >>>>>>> couldn’t tag the global values in the Module’s LLVM assembly with the >>>>>>> corresponding summary entry if the ModuleSummaryIndex is available when >>>>>>> printing the Module in the assembly writer. I.e. we could do the following >>>>>>> for “main” from the above example when printing the IR definition (note the >>>>>>> “^3” at the end): >>>>>>> >>>>>>> >>>>>>> I believe the reason that the ModuleSummaryIndex is not attached to >>>>>>> the Module is that it is fundamentally not a piece of IR, but it is >>>>>>> conceptually really an Analysis result. >>>>>>> Usually other analyses don't serialize their result, we happen to >>>>>>> serialize this one for an optimization purpose (reloading it and making the >>>>>>> thin-link faster). >>>>>>> >>>>>> >>>>>> True. My understanding is that the push for having it serialized via >>>>>> assembly is due to the fact that it is emitted into the bitcode. I know >>>>>> there is disagreement on this reasoning, I am hoping to have a proposal >>>>>> that is acceptable to everyone. =) >>>>>> >>>>>> >>>>>> >>>>>>> The fundamental problem is that an analysis result has to be able to >>>>>>> be invalidated with IR changes, attaching this directly to the module >>>>>>> wouldn't achieve this. The risk is that when the IR and the summary get >>>>>>> out-of-sync (`clang -O2 my_module_with_summaries.ll -emit-llvm -o >>>>>>> my_optimized module_with_summaries.ll`) the summaries would be badly wrong. >>>>>>> >>>>>>> Have you looked into what it'd take to make it a "real" analysis in >>>>>>> the pass manager? >>>>>>> >>>>>> >>>>>> Thanks for raising this issue specifically, I hadn't addressed it in >>>>>> my proposal and it is a big one. I am not proposing that we attempt to >>>>>> maintain the summary through optimization passes, and definitely don't >>>>>> think we should do that. IMO deserializing it should be for testing the >>>>>> thin link and the combined summaries in the backends only. To that end, I >>>>>> have an idea (below some background first). >>>>>> >>>>>> Note that in some cases the module summary analysis is an analysis >>>>>> pass. I.e. when invoked by "opt -module-summary=". However, some time ago >>>>>> when Peter added the support for splitting the bitcode (for CFI purposes) >>>>>> and therefore needed to generate a summary in each partition (Thin and >>>>>> Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module >>>>>> summary builder directly (twice). This writer is what gets invoked now when >>>>>> building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is >>>>>> not invoked/maintained as an analysis pass/result. It would be tricky to >>>>>> figure out how to even split rather than recompute the module summary index >>>>>> in that case. Even in the case where we are still invoking as an analysis >>>>>> pass (opt -module-summary), we would need to figure out how to read in the >>>>>> module summary to use as the analysis result when available (so that it >>>>>> could be invalidated and recomputed when stale). >>>>>> >>>>>> Rather than add this plumbing, and just have it discarded if opt does >>>>>> any optimization, I think we should focus at least for the time being on >>>>>> supporting reading the summary from assembly exactly where we currently >>>>>> read in the summary from bitcode: >>>>>> 1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which >>>>>> currently have to be preceded by "opt -module-summary/-thinlto-bc" to >>>>>> generate an index, but could just build it from assembly instead). >>>>>> 2) For the LTO backends (e.g. tools such as llvm-lto which can >>>>>> consume a combined index and invoke the backends, or "clang >>>>>> -fthinlto-index=" for distributed ThinLTO backend testing), where we could >>>>>> build the combined summary index from assembly instead. >>>>>> >>>>>> This greatly simplifies the reading side, as there are no >>>>>> optimizations performed on the IR after the index is read in these cases >>>>>> that would require invalidation. It also simplifies adding the parsing >>>>>> support, since it gets invoked exactly where we expect to build an index >>>>>> currently (i.e. since we don't currently build or store the >>>>>> ModuleSummaryIndex when parsing the Module from bitcode). It doesn't >>>>>> preclude someone from figuring out how to compute the module summary >>>>>> analysis result from the assembly, and invalidating it after optimization, >>>>>> when reading the Module IR via 'opt' in the future. >>>>>> >>>>>> Does this seem like a reasonable proposal to everyone? >>>>>> >>>>> >>>>> Sounds good to me. >>>>> >>>>> That would make .ll files quite convenient during debugging I think? >>>>> We could disassemble, manually change summaries, and re-assemble a bitcode >>>>> file before running the (thin-)link again. >>>>> >>>> >>>> Yes, that seems more reasonable than what I thought you had in mind. If >>>> the only consumer of this information is llvm-as, then the purpose of the >>>> asm summary format is just to provide a way to create a .bc file for >>>> testing purposes, which is certainly a useful capability. >>>> >>> >>> That feels a bit surprising if ".ll -> llvm-as -> .bc -> <some other >>> tool - opt, etc>" is different from ".ll -> <opt, etc>". Is that what we're >>> talking about here? Any chance that can be avoided & feeding a .ll file >>> works (in the sense of does the same thing/tests the same behavior) in all >>> the same places that feeding a .bc file does? (as is the case with >>> non-summary-based IR, to the best of my knowledge) >>> >> > I may be mistaken, but I don't think we have a lot of tools that can read > both .ll and .bc and end up using the summary if it is a .bc file. LTO > can't read .ll, for example. The only one that I can think of is clang and > presumably we could make that use whichever API we would use in llvm-as for > reading the summary from .ll. >I was thinking it would be nice to enable tools like llvm-lto2 to take a .ll file for reading the summary, or is that difficult? So the behaviour of most tools would be that ".ll -> llvm-as -> .bc ->> <some tool>" vs ".ll -> <some tool>" would end up being the same in both > cases: the summary gets discarded. >I assume you are referring to tools that don't currently need the summary? I.e. if <some tool> was llvm-lto2 we would not want the summary discarded. But 'opt' would under the current proposal, as we don't normally read a summary there and don't want to try to keep the summary in sync with the IR for all the reasons mentioned in this thread. Teresa> > > Peter > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-- Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180503/624c3a09/attachment-0001.html>