thr3ads.net - llvm dev - [llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format [Jun 2017]

Teresa Johnson via llvm-dev

2017-Jun-07 16:38 UTC

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos <charles.saternos at gmail.com> wrote:
> Alright, now it outputs YAML in the following format:
>
> ---
> NamedGlobalValueMap:
>   X:
>     - Kind:            GlobalVar
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: false
>       Live:            false
>   a:
>     - Kind:            Alias
>       Linkage:         WeakAnyLinkage
>       NotEligibleToImport: false
>       Live:            false
>       AliaseeGUID:     1881667236089500162
>   afun:
>     - Kind:            Function
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: false
>       Live:            false
>       InstCount:       2
>   testtest:
>     - Kind:            Function
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: false
>       Live:            false
>       InstCount:       2
>       Calls:
>         - Function:        14471680721094503013
> TypeIdMap:
> WithGlobalValueDeadStripping: false
> ...
>
> Any thoughts on the new format?
>
Thanks, Charles. The main improvement I think we would want is to output
value names instead of the GUID. Can you build up a map from GUID -> name
ahead of time and use those like you were for your initial patch? Actually,
I also think it would be useful to emit both the GUID and the name, since
the combined index will eventually only have the GUID, so this would give a
mapping to use for at least the visual inspection of the combined index.
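
A minimal sketch of the GUID -> name map idea, written in Python for brevity
rather than as the C++ an actual patch would use. It assumes a GUID is the
low 64 bits of the MD5 digest of the value's name, read little-endian; that
is an assumption about the hashing scheme, and it ignores the source-file
prefix that local-linkage values would carry. The helper names
(guid_for_name, build_guid_to_name, describe_callee) are hypothetical.

```python
import hashlib


def guid_for_name(name: str) -> int:
    """Assumed GUID scheme: low 64 bits of MD5(name), little-endian."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def build_guid_to_name(names):
    """Build the GUID -> name map ahead of time, before emitting anything."""
    return {guid_for_name(n): n for n in names}


def describe_callee(guid: int, guid_to_name: dict) -> dict:
    """Emit both the name and the GUID; fall back to the GUID alone."""
    if guid in guid_to_name:
        return {"Name": guid_to_name[guid], "GUID": guid}
    return {"GUID": guid}


# Value names taken from the example module quoted in this thread.
names = ["X", "a", "afun", "testtest", "boop"]
guid_to_name = build_guid_to_name(names)
callee = describe_callee(guid_for_name("boop"), guid_to_name)
print(callee["Name"])
```

Keeping the GUID next to the name means an entry stays usable even when the
name lookup fails, which is the situation expected for the combined index.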

Also, would be good to see an example with FDO, to make sure the hotness
info of the calls is emitted.

Teresa

> Thanks,
> Charles
>
> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>>
>>
>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>:
>>
>>>
>>>
>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>
>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
>>>> llvm-dev at lists.llvm.org>:
>>>>
>>>>> I know there's been a bunch of discussion here already, but I was
>>>>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>>>>
>>>>> 1) summarize the current state
>>>>> 2) describe the end-goal
>>>>> 3) describe what steps (& how this patch relates) are planned to get
>>>>> to (2)
>>>>>
>>>>> My naive thoughts, not being intimately familiar with any of this:
>>>>> Usually bitcode and textual IR support go in together or around the same
>>>>> time, and designed that way from the start (take r211920 for example,
>>>>> which added an explicit representation of COMDATs to the IR). This seems to
>>>>> have been an oversight in the implementation of IR summaries (is that an
>>>>> accurate representation/statement?)
>>>>>
>>>>
>>>> More or less: it was not an oversight.
>>>> The summaries are not really part of the IR, it is more like an
>>>> "analysis result" that is serialized. It can always be recomputed from the
>>>> IR. This aspect makes it quite "special", it is the only analysis result
>>>> that I know of that we serialize.
>>>>
>>>
>>> The use list work seems pretty similar in some ways (granted, can't be
>>> recomputed to match, hence the desire to serialize it for test case
>>> implementation).
>>>
>>
>> I see use-list as a leaky implementation detail of the IR that we
>> serialized because it impacts the processing of the IR.
>>
>> Summaries are more like serializing the CFG for example.
>>
>>
>>> But it looks like the same is true here to a degree - there are test
>>> cases that exercise the summary handling, so they want summaries for input
>>> (for now, I think, I've seen test cases that run another LLVM tool to
>>> insert/create a summary to then feed that back in for a test), or to test
>>> that the resulting summary is correct.
>>>
>>
>> We have cases where we want summaries as an input and check a combined
>> summary as an output, and for these having the YAML representation will be
>> useful (we didn't have it before).
>>
>>
>>>
>>> Can summaries be standalone? I thought they could (that'd be ideal for
>>> the distributed situation - only the summary needs to go to the 'thin link'
>>> step, I think? (currently maybe only the debug info is stripped for that -
>>> but ideally other unused IR wouldn't be shipped there as well, I would
>>> think)
>>>
>>
>> Yes conceptually they can be standalone.
>>
>>
>>>
>>>
>>>>
>>>>
>>>>> & now there's an effort to correct that.
>>>>>
>>>>
>>>> The main motivation here, I believe, is more to help developers get a
>>>> human-readable/understandable dump for ThinLTO bitcodes. Having to
>>>> inspect summaries separately is a pain.
>>>>
>>>
>>> Not sure I quite follow - inspect separately?
>>>
>>
>> llvm-dis does not display summaries today, so you can't just use llvm-dis
>> like a "regular" flow.
>>
>>
>>> How are they inspected today?
>>>
>>
>> llvm-bcanalyzer? And now the YAML dump as well.
>>
>>
>>> & also, I think there are test cases that want to/are currently testing
>>> summary input but do so somewhat awkwardly by using another tool to produce
>>> the summary first. Ideally the test case would have the summary written in
>>> to start, I would think, if that's a codepath worth testing?
>>>
>>
>> The IR already contains all the information, so why repeat it? This
>> makes the test case harder to maintain; in the vast majority of cases, I
>> expect that if a test needs IR then it shouldn't need to include a summary
>> as well (and vice-versa).
>>
>> In the majority of the tests we have, we want to check whether the importing
>> does what it is supposed to do, and whether the linkages are correctly
>> adjusted. With a YAML (or other) serialization for the summaries this could
>> indeed be done purely with summaries, without any IR involved.
>>
>> --
>> Mehdi
>>
>>
>>
>>
>>
>>
>>>
>>> - Dave
>>>
>>>
>>>>
>>>>  --
>>>> Mehdi
>>>>
>>>>> So it seems like that would start with a discussion of what the right
>>>>> end-state would be: What the syntax in textual IR should be, then
>>>>> implementing it. I can understand implementing such a thing in steps - it's
>>>>> perhaps more involved than the COMDAT situation. In that case starting on
>>>>> either side seems fine - implementing the emission first (hidden behind a
>>>>> flag, so as not to break round-tripping in the interim) or the parsing
>>>>> first (no need to hide it behind any flags - manually written examples can
>>>>> be used as input tests).
>>>>>
>>>>> (& it sounds like there's some partially implemented functionality
>>>>> using a YAML format that was intended to address how some test cases could
>>>>> be written? & this might be a good basis for the syntax - but seems to me
>>>>> like it might be a bit disjointed/out of place in the textual IR format
>>>>> that's not otherwise YAML-based?)
>>>>>
>>>>> - Dave
>>>>>
>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> Below is the proposed format for the dump of the ThinLTO module
>>>>>> summary in the llvm-dis utility:
>>>>>>
>>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>>>>>> ; ModuleID = '2.o'
>>>>>> source_filename = "2.ll"
>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>> target triple = "x86_64-unknown-linux-gnu"
>>>>>>
>>>>>> @X = constant i32 42, section "foo", align 4
>>>>>>
>>>>>> @a = weak alias i32, i32* @X
>>>>>>
>>>>>> define void @afun() {
>>>>>>   %1 = load i32, i32* @a
>>>>>>   ret void
>>>>>> }
>>>>>>
>>>>>> define void @testtest() {
>>>>>>   tail call void @boop()
>>>>>>   ret void
>>>>>> }
>>>>>>
>>>>>> declare void @boop()
>>>>>>
>>>>>> ; Module summary:
>>>>>> ;  testtest (External linkage)
>>>>>> ;    Function (2 instructions)
>>>>>> ;    Calls: boop
>>>>>> ;  X (External linkage)
>>>>>> ;    Global Variable
>>>>>> ;  afun (External linkage)
>>>>>> ;    Function (2 instructions)
>>>>>> ;    Refs:
>>>>>> ;      a
>>>>>> ;  a (Weak any linkage)
>>>>>> ;    Alias (aliasee X)
>>>>>>
>>>>>> I've implemented the above format in the llvm-dis utility, since
>>>>>> there currently isn't really a way of getting ThinLTO summaries in a
>>>>>> human-readable format.
>>>>>>
>>>>>> Let me know what you think of this format, and what information you
>>>>>> think should be added/removed.
>>>>>>
>>>>>> Thanks,
>>>>>> Charles
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>
>>>>>
>>>>>
>>
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413

Charles Saternos via llvm-dev

2017-Jun-08 23:16 UTC

Hey Teresa,

I've updated the YAML to include the names and GUIDs for all
functions/global vars/aliases. I've also added the hotness info to the
output, but for some reason, none of my tests when running with FDO gave
anything besides Unknown. I'll be looking more into this tomorrow.

Here's the current format:
> ../build/bin/llvm-lto2 dump-summary b.o
---
NamedGlobalValueMap:
  :
    - GUID:            3762489268811518743
      Kind:            GlobalVar
      Linkage:         PrivateLinkage
      NotEligibleToImport: true
      Live:            false
  cold:
    - GUID:            11668175513417606517
      Kind:            Function
      Linkage:         ExternalLinkage
      NotEligibleToImport: true
      Live:            false
      InstCount:       5
      Calls:
        - Name:            puts
          GUID:            8979701042202144121
          Hotness:         Unknown
  fib:
    - GUID:            8667248078361406812
      Kind:            Function
      Linkage:         ExternalLinkage
      NotEligibleToImport: true
      Live:            false
      InstCount:       26
      Calls:
        - Name:            fib
          GUID:            8667248078361406812
          Hotness:         Unknown
  hot:
    - GUID:            10177652421713147431
      Kind:            Function
      Linkage:         ExternalLinkage
      NotEligibleToImport: true
      Live:            false
      InstCount:       14
      Calls:
        - Name:            fib
          GUID:            8667248078361406812
          Hotness:         Unknown
        - Name:            printf
          GUID:            7383291119112528047
          Hotness:         Unknown
  llvm.used:
    - GUID:            15665353970260777610
      Kind:            GlobalVar
      Linkage:         AppendingLinkage
      NotEligibleToImport: true
      Live:            true
TypeIdMap:
WithGlobalValueDeadStripping: false
...
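
One way to check the hotness values in a dump like this is to flatten it
into caller -> callee edges. The sketch below is only an illustration: the
dict is a hand-copied subset of the YAML above (a real script would load the
YAML with a parser), and the helper names call_edges and external_callees
are hypothetical.

```python
# Hand-copied subset of the dump above (a real script would parse the YAML).
summary = {
    "cold": [{"GUID": 11668175513417606517, "Kind": "Function",
              "Calls": [{"Name": "puts", "GUID": 8979701042202144121,
                         "Hotness": "Unknown"}]}],
    "fib": [{"GUID": 8667248078361406812, "Kind": "Function",
             "Calls": [{"Name": "fib", "GUID": 8667248078361406812,
                        "Hotness": "Unknown"}]}],
    "hot": [{"GUID": 10177652421713147431, "Kind": "Function",
             "Calls": [{"Name": "fib", "GUID": 8667248078361406812,
                        "Hotness": "Unknown"},
                       {"Name": "printf", "GUID": 7383291119112528047,
                        "Hotness": "Unknown"}]}],
    "llvm.used": [{"GUID": 15665353970260777610, "Kind": "GlobalVar"}],
}


def call_edges(named_map):
    """Yield (caller, callee, hotness) for every call in every summary."""
    for caller, summaries in named_map.items():
        for entry in summaries:
            for call in entry.get("Calls", []):
                yield caller, call["Name"], call["Hotness"]


def external_callees(named_map):
    """Names of callees whose GUID has no summary entry in this module."""
    defined = {e["GUID"] for entries in named_map.values() for e in entries}
    return sorted({call["Name"]
                   for entries in named_map.values()
                   for e in entries
                   for call in e.get("Calls", [])
                   if call["GUID"] not in defined})


edges = list(call_edges(summary))
unknown = [edge for edge in edges if edge[2] == "Unknown"]
print(len(edges), len(unknown))    # 4 4
print(external_callees(summary))   # ['printf', 'puts']
```

For this dump every edge reports Unknown, which matches the observation that
the FDO runs so far gave nothing else.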

Thanks,
Charles



Teresa Johnson via llvm-dev

2017-Jun-08 23:55 UTC

Great! For the hotness, try creating a small test case with a very hot loop
that iterates many times. Let me know if you are still having trouble.
While the llvm-dis serialization is being discussed, I suppose at the very
least this can go in with the rest of the existing YAML summary dumping and
get emitted from llvm-lto2 using the patch Peter attached. Peter - do you
want to add that to llvm-lto2 so that we have a way of dumping the existing
YAML-supported summary info to stdout, or would you rather have Charles
take that one over and submit it (it probably just needs a test case)?

Teresa


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
