thr3ads.net - llvm dev - [llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format [Jun 2017]

If this information is useful, please help other people find it:
Share via:

Teresa Johnson via llvm-dev

2017-Jun-08 23:55 UTC

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

Great! For the hotness, try creating a small test case with a very hot loop
that iterates many times. Let me know if you are still having trouble.
While the llvm-dis serialization is being discussed, I suppose at the very
least this can go in with the rest of the existing YAML summary dumping and
get emitted from llvm-lto2 using the patch Peter attached. Peter - do you
want to add that to llvm-lto2 so that we have a way of dumping the existing
YAML supported summary info to stdout, or would you rather have Charles
take that one over and submit it (probably just needs a test case).

Teresa

On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos <charles.saternos at
gmail.com> wrote:
> Hey Teresa,
>
> I've updated the YAML to include the names and GUIDs for all
> functions/global vars/aliases. I've also added the hotness info to the
> output, but for some reason, none of my tests when running with FDO gave
> anything besides Unknown. I'll be looking more into this tomorrow.
>
> Here's the current format:
>
> > ../build/bin/llvm-lto2 dump-summary b.o
> ---
> NamedGlobalValueMap:
>   :
>     - GUID:            3762489268811518743
>       Kind:            GlobalVar
>       Linkage:         PrivateLinkage
>       NotEligibleToImport: true
>       Live:            false
>   cold:
>     - GUID:            11668175513417606517
>       Kind:            Function
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: true
>       Live:            false
>       InstCount:       5
>       Calls:
>         - Name:            puts
>           GUID:            8979701042202144121
>           Hotness:         Unknown
>   fib:
>     - GUID:            8667248078361406812
>       Kind:            Function
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: true
>       Live:            false
>       InstCount:       26
>       Calls:
>         - Name:            fib
>           GUID:            8667248078361406812
>           Hotness:         Unknown
>   hot:
>     - GUID:            10177652421713147431
>       Kind:            Function
>       Linkage:         ExternalLinkage
>       NotEligibleToImport: true
>       Live:            false
>       InstCount:       14
>       Calls:
>         - Name:            fib
>           GUID:            8667248078361406812
>           Hotness:         Unknown
>         - Name:            printf
>           GUID:            7383291119112528047
>           Hotness:         Unknown
>   llvm.used:
>     - GUID:            15665353970260777610
>       Kind:            GlobalVar
>       Linkage:         AppendingLinkage
>       NotEligibleToImport: true
>       Live:            true
> TypeIdMap:
> WithGlobalValueDeadStripping: false
> ...
>
> Thanks,
> Charles
>
>
> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at
google.com>
> wrote:
>
>>
>>
>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos <
>> charles.saternos at gmail.com> wrote:
>>
>>> Alright, now it outputs YAML in the following format:
>>>
>>> ---
>>> NamedGlobalValueMap:
>>>   X:
>>>     - Kind:            GlobalVar
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: false
>>>       Live:            false
>>>   a:
>>>     - Kind:            Alias
>>>       Linkage:         WeakAnyLinkage
>>>       NotEligibleToImport: false
>>>       Live:            false
>>>       AliaseeGUID:     1881667236089500162
>>>   afun:
>>>     - Kind:            Function
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: false
>>>       Live:            false
>>>       InstCount:       2
>>>   testtest:
>>>     - Kind:            Function
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: false
>>>       Live:            false
>>>       InstCount:       2
>>>       Calls:
>>>         - Function:        14471680721094503013
>>> TypeIdMap:
>>> WithGlobalValueDeadStripping: false
>>> ...
>>>
>>> Any thoughts on the new format?
>>>
>>
>> Thanks, Charles. The main improvement I think we would want is to
output
>> value names instead of the GUID. Can you build up a map from GUID ->
name
>> ahead of time and use those like you were for your initial patch?
Actually,
>> I also think it would be useful to emit both the GUID and the name,
since
>> the combined index will eventually only have the GUID, so this would
give a
>> mapping to use for at least the visual inspection of the combined
index.
>>
>> Also, would be good to see an example with FDO, to make sure the
hotness
>> info of the calls is emitted.
>>
>> Teresa
>>
>>
>>> Thanks,
>>> Charles
>>>
>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at
gmail.com> wrote:
>>>
>>>>
>>>>
>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at
gmail.com>:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at
gmail.com>
>>>>> wrote:
>>>>>
>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev
<
>>>>>> llvm-dev at lists.llvm.org>:
>>>>>>
>>>>>>> I know there's been a bunch of discussion here
already, but I was
>>>>>>> wondering if perhaps someone (probably Teresa?
Peter?) could:
>>>>>>>
>>>>>>> 1) summarize the current state
>>>>>>> 2) describe the end-goal
>>>>>>> 3) describe what steps (& how this patch
relates) are planned to get
>>>>>>> to (2)
>>>>>>>
>>>>>>> My naive thoughts, not being intimately familiar
with any of this:
>>>>>>> Usually bitcode and textual IR support go in
together or around the same
>>>>>>> time, and designed that way from the start (take
r211920 for examaple,
>>>>>>> which added an explicit representation of COMDATs
to the IR). This seems to
>>>>>>> have been an oversight in the implementation of IR
summaries (is that an
>>>>>>> accurate representation/statement?)
>>>>>>>
>>>>>>
>>>>>> More or less: it was not an oversight.
>>>>>> The summaries are not really part of the IR, it is more
like an
>>>>>> "analysis result" that is serialized. It can
always be recomputed from the
>>>>>> IR. This aspect makes it quite "special", it
is the only analysis result
>>>>>> that I know of that we serialize.
>>>>>>
>>>>>
>>>>> The use list work seems pretty similar in some ways
(granted, can't be
>>>>> recomputed to match, hence the desire to serialize it for
test case
>>>>> implementation).
>>>>>
>>>>
>>>> I see use-list as a leaky implementation detail of the IR that
we
>>>> serialized because it impact the processing of the IR.
>>>>
>>>> Summaries are more like serializing the CFG for example.
>>>>
>>>>
>>>>> But it looks like the same is true here to a degree - there
are test
>>>>> cases that exercise the summary handling, so they want
summaries for input
>>>>> (for now, I think, I've seen test cases that run
another LLVM tool to
>>>>> insert/create a summary to then feed that back in for a
test), or to test
>>>>> that the resulting summary is correct.
>>>>>
>>>>
>>>> We have cases were we want summaries as an input and check a
combined
>>>> summary as an output, and for these having the YAML
representation will be
>>>> useful (we didn't have it before).
>>>>
>>>>
>>>>>
>>>>> Can summaries be standalone? I thought they could
(that'd be ideal for
>>>>> the distributed situation - only the summary needs to go to
the 'thin link'
>>>>> step, I think? (currently maybe only the debug info is
stripped for that -
>>>>> but ideally other unused IR wouldn't be shipped there
as well, I would
>>>>> think)
>>>>>
>>>>
>>>> Yes conceptually they can be standalone.
>>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> & now there's an effort to correct that.
>>>>>>>
>>>>>>
>>>>>> The main motivation here, I believe, is more to help
dev to have
>>>>>> human readable/understandable dump for ThinLTO
bitcodes. Having to inspect
>>>>>> separately summaries is a pain.
>>>>>>
>>>>>
>>>>> Not sure I quite follow - inspect separately?
>>>>>
>>>>
>>>> llvm-dis does not display summaries today, so you can't
just use
>>>> llvm-dis like a "regular" flow.
>>>>
>>>>
>>>>> How are they inspected today?
>>>>>
>>>>
>>>> llvm-bcanalyzer? And now the YAML dump as well.
>>>>
>>>>
>>>>> & also, I think there are test cases that want to/are
currently
>>>>> testing summary input but do so somewhat awkwardly by using
another tool to
>>>>> produce the summary first. Ideally the test case would have
the summary
>>>>> written in to start, I would think, if that's a
codepath worth testing?
>>>>>
>>>>
>>>> The IR already contains all the information, so why repeating
it? This
>>>> makes the test case harder to maintain, in the vast majority, I
expect that
>>>> if a test needs IR then it shouldn't need to include a
summary as well (and
>>>> vice-versa).
>>>>
>>>> In the majority of test we have we want to check if the
importing does
>>>> what it is supposed to do, and if the linkage are correctly
adjusted. With
>>>> a YAML (or other) serialization for the summaries this could
indeed been
>>>> done purely with summaries, without any IR involved.
>>>>
>>>> --
>>>> Mehdi
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> - Dave
>>>>>
>>>>>
>>>>>>
>>>>>>  --
>>>>>> Mehdi
>>>>>>
>>>>>> So it seems like that would start with a discussion of
what the right
>>>>>>> end-state would be: What the syntax in textual IR
should be, then
>>>>>>> implementing it. I can understand implementing such
a thing in steps - it's
>>>>>>> perhaps more involved than the COMDAT situation. In
that case starting on
>>>>>>> either side seems fine - implementing the emission
first (hidden behind a
>>>>>>> flag, so as not to break round-tripping in the
interim) or the parsing
>>>>>>> first (no need to hide it behind any flags -
manually written examples can
>>>>>>> be used as input tests).
>>>>>>>
>>>>>>> (& it sounds like there's some partially
implemented functionality
>>>>>>> using a YAML format that was intended to address
how some test cases could
>>>>>>> be written? & this might be a good basis for
the syntax - but seems to me
>>>>>>> like it might be a bit disjointed/out of place in
the textual IR format
>>>>>>> that's not otherwise YAML-based?)
>>>>>>>
>>>>>>> - Dave
>>>>>>>
>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via
llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> Hey all,
>>>>>>>>
>>>>>>>> Below is the proposed format for the dump of
the ThinLTO module
>>>>>>>> summary in the llvm-dis utility:
>>>>>>>>
>>>>>>>> > ../build/bin/llvm-dis t.o && cat
t.o.ll
>>>>>>>> ; ModuleID = '2.o'
>>>>>>>> source_filename = "2.ll"
>>>>>>>> target datalayout =
"e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>>>> target triple =
"x86_64-unknown-linux-gnu"
>>>>>>>>
>>>>>>>> @X = constant i32 42, section "foo",
align 4
>>>>>>>>
>>>>>>>> @a = weak alias i32, i32* @X
>>>>>>>>
>>>>>>>> define void @afun() {
>>>>>>>>   %1 = load i32, i32* @a
>>>>>>>>   ret void
>>>>>>>> }
>>>>>>>>
>>>>>>>> define void @testtest() {
>>>>>>>>   tail call void @boop()
>>>>>>>>   ret void
>>>>>>>> }
>>>>>>>>
>>>>>>>> declare void @boop()
>>>>>>>>
>>>>>>>> ; Module summary:
>>>>>>>> ;  testtest (External linkage)
>>>>>>>> ;    Function (2 instructions)
>>>>>>>> ;    Calls: boop
>>>>>>>> ;  X (External linkage)
>>>>>>>> ;    Global Variable
>>>>>>>> ;  afun (External linkage)
>>>>>>>> ;    Function (2 instructions)
>>>>>>>> ;    Refs:
>>>>>>>> ;      a
>>>>>>>> ;  a (Weak any linkage)
>>>>>>>> ;    Alias (aliasee X)
>>>>>>>>
>>>>>>>> I've implemented the above format in the
llvm-dis utility, since
>>>>>>>> there currently isn't really a way of
getting ThinLTO summaries in a
>>>>>>>> human-readable format.
>>>>>>>>
>>>>>>>> Let me know what you think of this format, and
what information you
>>>>>>>> think should be added/removed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Charles
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>>
>>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/da136b53/attachment.html>

Peter Collingbourne via llvm-dev

2017-Jun-09 00:01 UTC

head link

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

I'd ask Charles to take it over. I think it just needs a test case and an
update to the usage message.

Peter

On Thu, Jun 8, 2017 at 4:55 PM, Teresa Johnson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Great! For the hotness, try creating a small test case with a very hot
> loop that iterates many times. Let me know if you are still having trouble.
> While the llvm-dis serialization is being discussed, I suppose at the very
> least this can go in with the rest of the existing YAML summary dumping and
> get emitted from llvm-lto2 using the patch Peter attached. Peter - do you
> want to add that to llvm-lto2 so that we have a way of dumping the existing
> YAML supported summary info to stdout, or would you rather have Charles
> take that one over and submit it (probably just needs a test case).
>
> Teresa
>
> On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos <
> charles.saternos at gmail.com> wrote:
>
>> Hey Teresa,
>>
>> I've updated the YAML to include the names and GUIDs for all
>> functions/global vars/aliases. I've also added the hotness info to
the
>> output, but for some reason, none of my tests when running with FDO
gave
>> anything besides Unknown. I'll be looking more into this tomorrow.
>>
>> Here's the current format:
>>
>> > ../build/bin/llvm-lto2 dump-summary b.o
>> ---
>> NamedGlobalValueMap:
>>   :
>>     - GUID:            3762489268811518743
>>       Kind:            GlobalVar
>>       Linkage:         PrivateLinkage
>>       NotEligibleToImport: true
>>       Live:            false
>>   cold:
>>     - GUID:            11668175513417606517
>>       Kind:            Function
>>       Linkage:         ExternalLinkage
>>       NotEligibleToImport: true
>>       Live:            false
>>       InstCount:       5
>>       Calls:
>>         - Name:            puts
>>           GUID:            8979701042202144121
>>           Hotness:         Unknown
>>   fib:
>>     - GUID:            8667248078361406812
>>       Kind:            Function
>>       Linkage:         ExternalLinkage
>>       NotEligibleToImport: true
>>       Live:            false
>>       InstCount:       26
>>       Calls:
>>         - Name:            fib
>>           GUID:            8667248078361406812
>>           Hotness:         Unknown
>>   hot:
>>     - GUID:            10177652421713147431
>>       Kind:            Function
>>       Linkage:         ExternalLinkage
>>       NotEligibleToImport: true
>>       Live:            false
>>       InstCount:       14
>>       Calls:
>>         - Name:            fib
>>           GUID:            8667248078361406812
>>           Hotness:         Unknown
>>         - Name:            printf
>>           GUID:            7383291119112528047
>>           Hotness:         Unknown
>>   llvm.used:
>>     - GUID:            15665353970260777610
>>       Kind:            GlobalVar
>>       Linkage:         AppendingLinkage
>>       NotEligibleToImport: true
>>       Live:            true
>> TypeIdMap:
>> WithGlobalValueDeadStripping: false
>> ...
>>
>> Thanks,
>> Charles
>>
>>
>> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at
google.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos <
>>> charles.saternos at gmail.com> wrote:
>>>
>>>> Alright, now it outputs YAML in the following format:
>>>>
>>>> ---
>>>> NamedGlobalValueMap:
>>>>   X:
>>>>     - Kind:            GlobalVar
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: false
>>>>       Live:            false
>>>>   a:
>>>>     - Kind:            Alias
>>>>       Linkage:         WeakAnyLinkage
>>>>       NotEligibleToImport: false
>>>>       Live:            false
>>>>       AliaseeGUID:     1881667236089500162
>>>>   afun:
>>>>     - Kind:            Function
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: false
>>>>       Live:            false
>>>>       InstCount:       2
>>>>   testtest:
>>>>     - Kind:            Function
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: false
>>>>       Live:            false
>>>>       InstCount:       2
>>>>       Calls:
>>>>         - Function:        14471680721094503013
>>>> TypeIdMap:
>>>> WithGlobalValueDeadStripping: false
>>>> ...
>>>>
>>>> Any thoughts on the new format?
>>>>
>>>
>>> Thanks, Charles. The main improvement I think we would want is to
output
>>> value names instead of the GUID. Can you build up a map from GUID
-> name
>>> ahead of time and use those like you were for your initial patch?
Actually,
>>> I also think it would be useful to emit both the GUID and the name,
since
>>> the combined index will eventually only have the GUID, so this
would give a
>>> mapping to use for at least the visual inspection of the combined
index.
>>>
>>> Also, would be good to see an example with FDO, to make sure the
hotness
>>> info of the calls is emitted.
>>>
>>> Teresa
>>>
>>>
>>>> Thanks,
>>>> Charles
>>>>
>>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at
gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at
gmail.com>:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI
<joker.eph at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via
llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org>:
>>>>>>>
>>>>>>>> I know there's been a bunch of discussion
here already, but I was
>>>>>>>> wondering if perhaps someone (probably Teresa?
Peter?) could:
>>>>>>>>
>>>>>>>> 1) summarize the current state
>>>>>>>> 2) describe the end-goal
>>>>>>>> 3) describe what steps (& how this patch
relates) are planned to
>>>>>>>> get to (2)
>>>>>>>>
>>>>>>>> My naive thoughts, not being intimately
familiar with any of this:
>>>>>>>> Usually bitcode and textual IR support go in
together or around the same
>>>>>>>> time, and designed that way from the start
(take r211920 for examaple,
>>>>>>>> which added an explicit representation of
COMDATs to the IR). This seems to
>>>>>>>> have been an oversight in the implementation of
IR summaries (is that an
>>>>>>>> accurate representation/statement?)
>>>>>>>>
>>>>>>>
>>>>>>> More or less: it was not an oversight.
>>>>>>> The summaries are not really part of the IR, it is
more like an
>>>>>>> "analysis result" that is serialized. It
can always be recomputed from the
>>>>>>> IR. This aspect makes it quite "special",
it is the only analysis result
>>>>>>> that I know of that we serialize.
>>>>>>>
>>>>>>
>>>>>> The use list work seems pretty similar in some ways
(granted, can't
>>>>>> be recomputed to match, hence the desire to serialize
it for test case
>>>>>> implementation).
>>>>>>
>>>>>
>>>>> I see use-list as a leaky implementation detail of the IR
that we
>>>>> serialized because it impact the processing of the IR.
>>>>>
>>>>> Summaries are more like serializing the CFG for example.
>>>>>
>>>>>
>>>>>> But it looks like the same is true here to a degree -
there are test
>>>>>> cases that exercise the summary handling, so they want
summaries for input
>>>>>> (for now, I think, I've seen test cases that run
another LLVM tool to
>>>>>> insert/create a summary to then feed that back in for a
test), or to test
>>>>>> that the resulting summary is correct.
>>>>>>
>>>>>
>>>>> We have cases were we want summaries as an input and check
a combined
>>>>> summary as an output, and for these having the YAML
representation will be
>>>>> useful (we didn't have it before).
>>>>>
>>>>>
>>>>>>
>>>>>> Can summaries be standalone? I thought they could
(that'd be ideal
>>>>>> for the distributed situation - only the summary needs
to go to the 'thin
>>>>>> link' step, I think? (currently maybe only the
debug info is stripped for
>>>>>> that - but ideally other unused IR wouldn't be
shipped there as well, I
>>>>>> would think)
>>>>>>
>>>>>
>>>>> Yes conceptually they can be standalone.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> & now there's an effort to correct
that.
>>>>>>>>
>>>>>>>
>>>>>>> The main motivation here, I believe, is more to
help dev to have
>>>>>>> human readable/understandable dump for ThinLTO
bitcodes. Having to inspect
>>>>>>> separately summaries is a pain.
>>>>>>>
>>>>>>
>>>>>> Not sure I quite follow - inspect separately?
>>>>>>
>>>>>
>>>>> llvm-dis does not display summaries today, so you can't
just use
>>>>> llvm-dis like a "regular" flow.
>>>>>
>>>>>
>>>>>> How are they inspected today?
>>>>>>
>>>>>
>>>>> llvm-bcanalyzer? And now the YAML dump as well.
>>>>>
>>>>>
>>>>>> & also, I think there are test cases that want
to/are currently
>>>>>> testing summary input but do so somewhat awkwardly by
using another tool to
>>>>>> produce the summary first. Ideally the test case would
have the summary
>>>>>> written in to start, I would think, if that's a
codepath worth testing?
>>>>>>
>>>>>
>>>>> The IR already contains all the information, so why
repeating it? This
>>>>> makes the test case harder to maintain, in the vast
majority, I expect that
>>>>> if a test needs IR then it shouldn't need to include a
summary as well (and
>>>>> vice-versa).
>>>>>
>>>>> In the majority of test we have we want to check if the
importing does
>>>>> what it is supposed to do, and if the linkage are correctly
adjusted. With
>>>>> a YAML (or other) serialization for the summaries this
could indeed been
>>>>> done purely with summaries, without any IR involved.
>>>>>
>>>>> --
>>>>> Mehdi
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> - Dave
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>  --
>>>>>>> Mehdi
>>>>>>>
>>>>>>> So it seems like that would start with a discussion
of what the
>>>>>>>> right end-state would be: What the syntax in
textual IR should be, then
>>>>>>>> implementing it. I can understand implementing
such a thing in steps - it's
>>>>>>>> perhaps more involved than the COMDAT
situation. In that case starting on
>>>>>>>> either side seems fine - implementing the
emission first (hidden behind a
>>>>>>>> flag, so as not to break round-tripping in the
interim) or the parsing
>>>>>>>> first (no need to hide it behind any flags -
manually written examples can
>>>>>>>> be used as input tests).
>>>>>>>>
>>>>>>>> (& it sounds like there's some
partially implemented functionality
>>>>>>>> using a YAML format that was intended to
address how some test cases could
>>>>>>>> be written? & this might be a good basis
for the syntax - but seems to me
>>>>>>>> like it might be a bit disjointed/out of place
in the textual IR format
>>>>>>>> that's not otherwise YAML-based?)
>>>>>>>>
>>>>>>>> - Dave
>>>>>>>>
>>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos
via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> Hey all,
>>>>>>>>>
>>>>>>>>> Below is the proposed format for the dump
of the ThinLTO module
>>>>>>>>> summary in the llvm-dis utility:
>>>>>>>>>
>>>>>>>>> > ../build/bin/llvm-dis t.o &&
cat t.o.ll
>>>>>>>>> ; ModuleID = '2.o'
>>>>>>>>> source_filename = "2.ll"
>>>>>>>>> target datalayout =
"e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>>>>> target triple =
"x86_64-unknown-linux-gnu"
>>>>>>>>>
>>>>>>>>> @X = constant i32 42, section
"foo", align 4
>>>>>>>>>
>>>>>>>>> @a = weak alias i32, i32* @X
>>>>>>>>>
>>>>>>>>> define void @afun() {
>>>>>>>>>   %1 = load i32, i32* @a
>>>>>>>>>   ret void
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> define void @testtest() {
>>>>>>>>>   tail call void @boop()
>>>>>>>>>   ret void
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> declare void @boop()
>>>>>>>>>
>>>>>>>>> ; Module summary:
>>>>>>>>> ;  testtest (External linkage)
>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>> ;    Calls: boop
>>>>>>>>> ;  X (External linkage)
>>>>>>>>> ;    Global Variable
>>>>>>>>> ;  afun (External linkage)
>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>> ;    Refs:
>>>>>>>>> ;      a
>>>>>>>>> ;  a (Weak any linkage)
>>>>>>>>> ;    Alias (aliasee X)
>>>>>>>>>
>>>>>>>>> I've implemented the above format in
the llvm-dis utility, since
>>>>>>>>> there currently isn't really a way of
getting ThinLTO summaries in a
>>>>>>>>> human-readable format.
>>>>>>>>>
>>>>>>>>> Let me know what you think of this format,
and what information
>>>>>>>>> you think should be added/removed.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Charles
>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>> 408-460-2413 <(408)%20460-2413>
>>>
>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>

-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170608/298570d7/attachment-0001.html>

Charles Saternos via llvm-dev

2017-Jun-09 19:58 UTC

head link

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

OK, I tested the hotness, and it works.
> I'd ask Charles to take it over. I think it just needs a test case and
anupdate to the usage message.

Sure - I've added the message and a quick test. The patch is here:
https://reviews.llvm.org/D34063

Thanks,
Charles

On Thu, Jun 8, 2017 at 8:01 PM, Peter Collingbourne <peter at pcc.me.uk>
wrote:
> I'd ask Charles to take it over. I think it just needs a test case and
an
> update to the usage message.
>
> Peter
>
> On Thu, Jun 8, 2017 at 4:55 PM, Teresa Johnson via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Great! For the hotness, try creating a small test case with a very hot
>> loop that iterates many times. Let me know if you are still having
trouble.
>> While the llvm-dis serialization is being discussed, I suppose at the
very
>> least this can go in with the rest of the existing YAML summary dumping
and
>> get emitted from llvm-lto2 using the patch Peter attached. Peter - do
you
>> want to add that to llvm-lto2 so that we have a way of dumping the
existing
>> YAML supported summary info to stdout, or would you rather have Charles
>> take that one over and submit it (probably just needs a test case).
>>
>> Teresa
>>
>> On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos <
>> charles.saternos at gmail.com> wrote:
>>
>>> Hey Teresa,
>>>
>>> I've updated the YAML to include the names and GUIDs for all
>>> functions/global vars/aliases. I've also added the hotness info
to the
>>> output, but for some reason, none of my tests when running with FDO
gave
>>> anything besides Unknown. I'll be looking more into this
tomorrow.
>>>
>>> Here's the current format:
>>>
>>> > ../build/bin/llvm-lto2 dump-summary b.o
>>> ---
>>> NamedGlobalValueMap:
>>>   :
>>>     - GUID:            3762489268811518743
>>>       Kind:            GlobalVar
>>>       Linkage:         PrivateLinkage
>>>       NotEligibleToImport: true
>>>       Live:            false
>>>   cold:
>>>     - GUID:            11668175513417606517
>>>       Kind:            Function
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: true
>>>       Live:            false
>>>       InstCount:       5
>>>       Calls:
>>>         - Name:            puts
>>>           GUID:            8979701042202144121
>>>           Hotness:         Unknown
>>>   fib:
>>>     - GUID:            8667248078361406812
>>>       Kind:            Function
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: true
>>>       Live:            false
>>>       InstCount:       26
>>>       Calls:
>>>         - Name:            fib
>>>           GUID:            8667248078361406812
>>>           Hotness:         Unknown
>>>   hot:
>>>     - GUID:            10177652421713147431
>>>       Kind:            Function
>>>       Linkage:         ExternalLinkage
>>>       NotEligibleToImport: true
>>>       Live:            false
>>>       InstCount:       14
>>>       Calls:
>>>         - Name:            fib
>>>           GUID:            8667248078361406812
>>>           Hotness:         Unknown
>>>         - Name:            printf
>>>           GUID:            7383291119112528047
>>>           Hotness:         Unknown
>>>   llvm.used:
>>>     - GUID:            15665353970260777610
>>>       Kind:            GlobalVar
>>>       Linkage:         AppendingLinkage
>>>       NotEligibleToImport: true
>>>       Live:            true
>>> TypeIdMap:
>>> WithGlobalValueDeadStripping: false
>>> ...
>>>
>>> Thanks,
>>> Charles
>>>
>>>
>>> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at
google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos <
>>>> charles.saternos at gmail.com> wrote:
>>>>
>>>>> Alright, now it outputs YAML in the following format:
>>>>>
>>>>> ---
>>>>> NamedGlobalValueMap:
>>>>>   X:
>>>>>     - Kind:            GlobalVar
>>>>>       Linkage:         ExternalLinkage
>>>>>       NotEligibleToImport: false
>>>>>       Live:            false
>>>>>   a:
>>>>>     - Kind:            Alias
>>>>>       Linkage:         WeakAnyLinkage
>>>>>       NotEligibleToImport: false
>>>>>       Live:            false
>>>>>       AliaseeGUID:     1881667236089500162
>>>>>   afun:
>>>>>     - Kind:            Function
>>>>>       Linkage:         ExternalLinkage
>>>>>       NotEligibleToImport: false
>>>>>       Live:            false
>>>>>       InstCount:       2
>>>>>   testtest:
>>>>>     - Kind:            Function
>>>>>       Linkage:         ExternalLinkage
>>>>>       NotEligibleToImport: false
>>>>>       Live:            false
>>>>>       InstCount:       2
>>>>>       Calls:
>>>>>         - Function:        14471680721094503013
>>>>> TypeIdMap:
>>>>> WithGlobalValueDeadStripping: false
>>>>> ...
>>>>>
>>>>> Any thoughts on the new format?
>>>>>
>>>>
>>>> Thanks, Charles. The main improvement I think we would want is
to
>>>> output value names instead of the GUID. Can you build up a map
from GUID ->
>>>> name ahead of time and use those like you were for your initial
patch?
>>>> Actually, I also think it would be useful to emit both the GUID
and the
>>>> name, since the combined index will eventually only have the
GUID, so this
>>>> would give a mapping to use for at least the visual inspection
of the
>>>> combined index.
>>>>
>>>> Also, would be good to see an example with FDO, to make sure
the
>>>> hotness info of the calls is emitted.
>>>>
>>>> Teresa
>>>>
>>>>
>>>>> Thanks,
>>>>> Charles
>>>>>
>>>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph
at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie
at gmail.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI
<joker.eph at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via
llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org>:
>>>>>>>>
>>>>>>>>> I know there's been a bunch of
discussion here already, but I was
>>>>>>>>> wondering if perhaps someone (probably
Teresa? Peter?) could:
>>>>>>>>>
>>>>>>>>> 1) summarize the current state
>>>>>>>>> 2) describe the end-goal
>>>>>>>>> 3) describe what steps (& how this
patch relates) are planned to
>>>>>>>>> get to (2)
>>>>>>>>>
>>>>>>>>> My naive thoughts, not being intimately
familiar with any of this:
>>>>>>>>> Usually bitcode and textual IR support go
in together or around the same
>>>>>>>>> time, and designed that way from the start
(take r211920 for examaple,
>>>>>>>>> which added an explicit representation of
COMDATs to the IR). This seems to
>>>>>>>>> have been an oversight in the
implementation of IR summaries (is that an
>>>>>>>>> accurate representation/statement?)
>>>>>>>>>
>>>>>>>>
>>>>>>>> More or less: it was not an oversight.
>>>>>>>> The summaries are not really part of the IR, it
is more like an
>>>>>>>> "analysis result" that is serialized.
It can always be recomputed from the
>>>>>>>> IR. This aspect makes it quite
"special", it is the only analysis result
>>>>>>>> that I know of that we serialize.
>>>>>>>>
>>>>>>>
>>>>>>> The use list work seems pretty similar in some ways
(granted, can't
>>>>>>> be recomputed to match, hence the desire to
serialize it for test case
>>>>>>> implementation).
>>>>>>>
>>>>>>
>>>>>> I see use-list as a leaky implementation detail of the
IR that we
>>>>>> serialized because it impact the processing of the IR.
>>>>>>
>>>>>> Summaries are more like serializing the CFG for
example.
>>>>>>
>>>>>>
>>>>>>> But it looks like the same is true here to a degree
- there are test
>>>>>>> cases that exercise the summary handling, so they
want summaries for input
>>>>>>> (for now, I think, I've seen test cases that
run another LLVM tool to
>>>>>>> insert/create a summary to then feed that back in
for a test), or to test
>>>>>>> that the resulting summary is correct.
>>>>>>>
>>>>>>
>>>>>> We have cases were we want summaries as an input and
check a combined
>>>>>> summary as an output, and for these having the YAML
representation will be
>>>>>> useful (we didn't have it before).
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Can summaries be standalone? I thought they could
(that'd be ideal
>>>>>>> for the distributed situation - only the summary
needs to go to the 'thin
>>>>>>> link' step, I think? (currently maybe only the
debug info is stripped for
>>>>>>> that - but ideally other unused IR wouldn't be
shipped there as well, I
>>>>>>> would think)
>>>>>>>
>>>>>>
>>>>>> Yes conceptually they can be standalone.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> & now there's an effort to correct
that.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The main motivation here, I believe, is more to
help dev to have
>>>>>>>> human readable/understandable dump for ThinLTO
bitcodes. Having to inspect
>>>>>>>> separately summaries is a pain.
>>>>>>>>
>>>>>>>
>>>>>>> Not sure I quite follow - inspect separately?
>>>>>>>
>>>>>>
>>>>>> llvm-dis does not display summaries today, so you
can't just use
>>>>>> llvm-dis like a "regular" flow.
>>>>>>
>>>>>>
>>>>>>> How are they inspected today?
>>>>>>>
>>>>>>
>>>>>> llvm-bcanalyzer? And now the YAML dump as well.
>>>>>>
>>>>>>
>>>>>>> & also, I think there are test cases that want
to/are currently
>>>>>>> testing summary input but do so somewhat awkwardly
by using another tool to
>>>>>>> produce the summary first. Ideally the test case
would have the summary
>>>>>>> written in to start, I would think, if that's a
codepath worth testing?
>>>>>>>
>>>>>>
>>>>>> The IR already contains all the information, so why
repeating it?
>>>>>> This makes the test case harder to maintain, in the
vast majority, I expect
>>>>>> that if a test needs IR then it shouldn't need to
include a summary as well
>>>>>> (and vice-versa).
>>>>>>
>>>>>> In the majority of test we have we want to check if the
importing
>>>>>> does what it is supposed to do, and if the linkage are
correctly adjusted.
>>>>>> With a YAML (or other) serialization for the summaries
this could indeed
>>>>>> been done purely with summaries, without any IR
involved.
>>>>>>
>>>>>> --
>>>>>> Mehdi
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> - Dave
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>  --
>>>>>>>> Mehdi
>>>>>>>>
>>>>>>>> So it seems like that would start with a
discussion of what the
>>>>>>>>> right end-state would be: What the syntax
in textual IR should be, then
>>>>>>>>> implementing it. I can understand
implementing such a thing in steps - it's
>>>>>>>>> perhaps more involved than the COMDAT
situation. In that case starting on
>>>>>>>>> either side seems fine - implementing the
emission first (hidden behind a
>>>>>>>>> flag, so as not to break round-tripping in
the interim) or the parsing
>>>>>>>>> first (no need to hide it behind any flags
- manually written examples can
>>>>>>>>> be used as input tests).
>>>>>>>>>
>>>>>>>>> (& it sounds like there's some
partially implemented functionality
>>>>>>>>> using a YAML format that was intended to
address how some test cases could
>>>>>>>>> be written? & this might be a good
basis for the syntax - but seems to me
>>>>>>>>> like it might be a bit disjointed/out of
place in the textual IR format
>>>>>>>>> that's not otherwise YAML-based?)
>>>>>>>>>
>>>>>>>>> - Dave
>>>>>>>>>
>>>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles
Saternos via llvm-dev <
>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hey all,
>>>>>>>>>>
>>>>>>>>>> Below is the proposed format for the
dump of the ThinLTO module
>>>>>>>>>> summary in the llvm-dis utility:
>>>>>>>>>>
>>>>>>>>>> > ../build/bin/llvm-dis t.o
&& cat t.o.ll
>>>>>>>>>> ; ModuleID = '2.o'
>>>>>>>>>> source_filename = "2.ll"
>>>>>>>>>> target datalayout =
"e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>>>>>> target triple =
"x86_64-unknown-linux-gnu"
>>>>>>>>>>
>>>>>>>>>> @X = constant i32 42, section
"foo", align 4
>>>>>>>>>>
>>>>>>>>>> @a = weak alias i32, i32* @X
>>>>>>>>>>
>>>>>>>>>> define void @afun() {
>>>>>>>>>>   %1 = load i32, i32* @a
>>>>>>>>>>   ret void
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> define void @testtest() {
>>>>>>>>>>   tail call void @boop()
>>>>>>>>>>   ret void
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> declare void @boop()
>>>>>>>>>>
>>>>>>>>>> ; Module summary:
>>>>>>>>>> ;  testtest (External linkage)
>>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>>> ;    Calls: boop
>>>>>>>>>> ;  X (External linkage)
>>>>>>>>>> ;    Global Variable
>>>>>>>>>> ;  afun (External linkage)
>>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>>> ;    Refs:
>>>>>>>>>> ;      a
>>>>>>>>>> ;  a (Weak any linkage)
>>>>>>>>>> ;    Alias (aliasee X)
>>>>>>>>>>
>>>>>>>>>> I've implemented the above format
in the llvm-dis utility, since
>>>>>>>>>> there currently isn't really a way
of getting ThinLTO summaries in a
>>>>>>>>>> human-readable format.
>>>>>>>>>>
>>>>>>>>>> Let me know what you think of this
format, and what information
>>>>>>>>>> you think should be added/removed.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Charles
>>>>>>>>>>
>>>>>>>>>>
_______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com
|
>>>> 408-460-2413 <(408)%20460-2413>
>>>>
>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
>
> --
> --
> Peter
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170609/2db5579b/attachment.html>

llvm dev - Jun 2017 - [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format