thr3ads.net - llvm dev - [llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format [Jun 2017]

Mehdi AMINI via llvm-dev

2017-Jun-06 20:26 UTC

[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org>:
> I know there's been a bunch of discussion here already, but I was
> wondering if perhaps someone (probably Teresa? Peter?) could:
>
> 1) summarize the current state
> 2) describe the end-goal
> 3) describe what steps (& how this patch relates) are planned to get to
> (2)
>
> My naive thoughts, not being intimately familiar with any of this: Usually
> bitcode and textual IR support go in together or around the same time, and
> designed that way from the start (take r211920 for example, which added an
> explicit representation of COMDATs to the IR). This seems to have been an
> oversight in the implementation of IR summaries (is that an accurate
> representation/statement?)
>
More or less: it was not an oversight.
The summaries are not really part of the IR, it is more like an "analysis
result" that is serialized. It can always be recomputed from the IR. This
aspect makes it quite "special", it is the only analysis result that I know
of that we serialize.
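
To illustrate the "can always be recomputed from the IR" point, here is a minimal Python sketch that derives a toy per-function summary (instruction count, callees, other referenced globals) from the textual IR in Charles's example. It is only an illustration: the function name toy_summary and the regex-based scan are assumptions made for this sketch, and it is not how LLVM actually builds its module summary.

```python
import re

# Textual IR copied from the example in the original RFC message.
IR = '''
@X = constant i32 42, section "foo", align 4

@a = weak alias i32, i32* @X

define void @afun() {
  %1 = load i32, i32* @a
  ret void
}

define void @testtest() {
  tail call void @boop()
  ret void
}

declare void @boop()
'''


def toy_summary(ir_text):
    """Recompute a toy summary for each defined function (hypothetical helper)."""
    summary = {}
    current = None  # name of the function body being scanned, if any
    for line in ir_text.splitlines():
        stripped = line.strip()
        header = re.match(r'define\s.*@([\w.$]+)\(', stripped)
        if header:
            current = header.group(1)
            summary[current] = {"insts": 0, "calls": [], "refs": []}
            continue
        if stripped == "}":
            current = None
            continue
        if current is None or not stripped:
            continue  # outside a function body, or a blank line
        entry = summary[current]
        entry["insts"] += 1  # one instruction per non-empty body line here
        call = re.search(r'\bcall\b.*?@([\w.$]+)\(', stripped)
        callee = call.group(1) if call else None
        if callee:
            entry["calls"].append(callee)
        # Any other global mentioned on the line counts as a reference.
        for name in re.findall(r'@([\w.$]+)', stripped):
            if name != callee and name not in entry["refs"]:
                entry["refs"].append(name)
    return summary


if __name__ == "__main__":
    for name, info in toy_summary(IR).items():
        print(name, info)
```

For the example module this gives two instructions for each function, a call from testtest to boop, and a reference from afun to a, which lines up with the function entries in the proposed dump.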

> & now there's an effort to correct that.
>
The main motivation here, I believe, is more to give developers a
human-readable, understandable dump of ThinLTO bitcode. Having to inspect
summaries separately is a pain.

 --
Mehdi

> So it seems like that would start with a discussion of what the right
> end-state would be: What the syntax in textual IR should be, then
> implementing it. I can understand implementing such a thing in steps - it's
> perhaps more involved than the COMDAT situation. In that case starting on
> either side seems fine - implementing the emission first (hidden behind a
> flag, so as not to break round-tripping in the interim) or the parsing
> first (no need to hide it behind any flags - manually written examples can
> be used as input tests).
>
> (& it sounds like there's some partially implemented functionality using a
> YAML format that was intended to address how some test cases could be
> written? & this might be a good basis for the syntax - but seems to me like
> it might be a bit disjointed/out of place in the textual IR format that's
> not otherwise YAML-based?)
>
> - Dave
>
> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hey all,
>>
>> Below is the proposed format for the dump of the ThinLTO module summary
>> in the llvm-dis utility:
>>
>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>> ; ModuleID = '2.o'
>> source_filename = "2.ll"
>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>> target triple = "x86_64-unknown-linux-gnu"
>>
>> @X = constant i32 42, section "foo", align 4
>>
>> @a = weak alias i32, i32* @X
>>
>> define void @afun() {
>>   %1 = load i32, i32* @a
>>   ret void
>> }
>>
>> define void @testtest() {
>>   tail call void @boop()
>>   ret void
>> }
>>
>> declare void @boop()
>>
>> ; Module summary:
>> ;  testtest (External linkage)
>> ;    Function (2 instructions)
>> ;    Calls: boop
>> ;  X (External linkage)
>> ;    Global Variable
>> ;  afun (External linkage)
>> ;    Function (2 instructions)
>> ;    Refs:
>> ;      a
>> ;  a (Weak any linkage)
>> ;    Alias (aliasee X)
>>
>> I've implemented the above format in the llvm-dis utility, since there
>> currently isn't really a way of getting ThinLTO summaries in a
>> human-readable format.
>>
>> Let me know what you think of this format, and what information you think
>> should be added/removed.
>>
>> Thanks,
>> Charles
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

David Blaikie via llvm-dev

2017-Jun-06 20:38 UTC

On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org>:
>
>> I know there's been a bunch of discussion here already, but I was
>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>
>> 1) summarize the current state
>> 2) describe the end-goal
>> 3) describe what steps (& how this patch relates) are planned to get to
>> (2)
>>
>> My naive thoughts, not being intimately familiar with any of this:
>> Usually bitcode and textual IR support go in together or around the same
>> time, and designed that way from the start (take r211920 for example,
>> which added an explicit representation of COMDATs to the IR). This seems to
>> have been an oversight in the implementation of IR summaries (is that an
>> accurate representation/statement?)
>>
>
> More or less: it was not an oversight.
> The summaries are not really part of the IR, it is more like an "analysis
> result" that is serialized. It can always be recomputed from the IR. This
> aspect makes it quite "special", it is the only analysis result that I know
> of that we serialize.
>
The use list work seems pretty similar in some ways (granted, can't be
recomputed to match, hence the desire to serialize it for test case
implementation).

But it looks like the same is true here to a degree - there are test cases
that exercise the summary handling, so they want summaries for input (for
now, I think, I've seen test cases that run another LLVM tool to
insert/create a summary to then feed that back in for a test), or to test
that the resulting summary is correct.

Can summaries be standalone? I thought they could (that'd be ideal for the
distributed situation - only the summary needs to go to the 'thin link'
step, I think? Currently maybe only the debug info is stripped for that,
but ideally other unused IR wouldn't be shipped there either, I would
think).

>
>
>> & now there's an effort to correct that.
>>
>
> The main motivation here, I believe, is more to help dev to have human
> readable/understandable dump for ThinLTO bitcodes. Having to inspect
> separately summaries is a pain.
>
Not sure I quite follow - inspect separately? How are they inspected today?

& also, I think there are test cases that want to/are currently testing
summary input but do so somewhat awkwardly by using another tool to produce
the summary first. Ideally the test case would have the summary written in
to start, I would think, if that's a codepath worth testing?

- Dave


Mehdi AMINI via llvm-dev

2017-Jun-06 21:21 UTC

2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>:
>
>
> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
>> llvm-dev at lists.llvm.org>:
>>
>>> I know there's been a bunch of discussion here already, but I was
>>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>>
>>> 1) summarize the current state
>>> 2) describe the end-goal
>>> 3) describe what steps (& how this patch relates) are planned to get to
>>> (2)
>>>
>>> My naive thoughts, not being intimately familiar with any of this:
>>> Usually bitcode and textual IR support go in together or around the same
>>> time, and designed that way from the start (take r211920 for example,
>>> which added an explicit representation of COMDATs to the IR). This seems to
>>> have been an oversight in the implementation of IR summaries (is that an
>>> accurate representation/statement?)
>>>
>>
>> More or less: it was not an oversight.
>> The summaries are not really part of the IR, it is more like an "analysis
>> result" that is serialized. It can always be recomputed from the IR. This
>> aspect makes it quite "special", it is the only analysis result that I know
>> of that we serialize.
>>
>
> The use list work seems pretty similar in some ways (granted, can't be
> recomputed to match, hence the desire to serialize it for test case
> implementation).
>
I see use-list as a leaky implementation detail of the IR that we
serialized because it impacts the processing of the IR.

Summaries are more like serializing the CFG for example.

> But it looks like the same is true here to a degree - there are test cases
> that exercise the summary handling, so they want summaries for input (for
> now, I think, I've seen test cases that run another LLVM tool to
> insert/create a summary to then feed that back in for a test), or to test
> that the resulting summary is correct.
>
We have cases where we want summaries as an input and check a combined
summary as an output, and for these having the YAML representation will be
useful (we didn't have it before).

>
> Can summaries be standalone? I thought they could (that'd be ideal for the
> distributed situation - only the summary needs to go to the 'thin link'
> step, I think? (currently maybe only the debug info is stripped for that -
> but ideally other unused IR wouldn't be shipped there as well, I would
> think)
>
Yes conceptually they can be standalone.

>
>
>>
>>
>>> & now there's an effort to correct that.
>>>
>>
>> The main motivation here, I believe, is more to help dev to have human
>> readable/understandable dump for ThinLTO bitcodes. Having to inspect
>> separately summaries is a pain.
>>
>
> Not sure I quite follow - inspect separately?
>
llvm-dis does not display summaries today, so you can't just use llvm-dis
like a "regular" flow.

> How are they inspected today?
>
llvm-bcanalyzer? And now the YAML dump as well.

> & also, I think there are test cases that want to/are currently testing
> summary input but do so somewhat awkwardly by using another tool to produce
> the summary first. Ideally the test case would have the summary written in
> to start, I would think, if that's a codepath worth testing?
>
The IR already contains all the information, so why repeat it? This
makes the test case harder to maintain. In the vast majority of cases, I
expect that if a test needs IR then it shouldn't need to include a summary
as well (and vice-versa).

In the majority of tests we have, we want to check that the importing does
what it is supposed to do, and that the linkages are correctly adjusted.
With a YAML (or other) serialization for the summaries this could indeed be
done purely with summaries, without any IR involved.
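
As a rough illustration of working from a summary alone, the Python sketch below parses the comment-style dump Charles proposed at the start of this thread and then answers a question without looking at any IR: which called or referenced symbols have no summary entry in this module. The helper name parse_dump, the dict layout, and the assumption that several callees on a "Calls:" line would be whitespace-separated are all made up for this sketch; the dump text is the RFC proposal, not necessarily what any llvm-dis release prints.

```python
import re

# The dump proposed in the RFC, copied from the original message.
DUMP = """\
; Module summary:
;  testtest (External linkage)
;    Function (2 instructions)
;    Calls: boop
;  X (External linkage)
;    Global Variable
;  afun (External linkage)
;    Function (2 instructions)
;    Refs:
;      a
;  a (Weak any linkage)
;    Alias (aliasee X)
"""


def parse_dump(text):
    """Parse the proposed dump into {name: {...}} (hypothetical helper)."""
    entries = {}
    current = None
    in_refs = False
    for raw in text.splitlines():
        if not raw.startswith(";"):
            continue
        body = raw[1:]
        indent = len(body) - len(body.lstrip(" "))
        item = body.strip()
        if not item or item == "Module summary:":
            continue
        if indent == 2:
            # "name (Some linkage)" starts a new entry.
            m = re.fullmatch(r"(\S+) \((.+) linkage\)", item)
            if m is None:
                continue
            current = {"linkage": m.group(2), "kind": None,
                       "calls": [], "refs": []}
            entries[m.group(1)] = current
            in_refs = False
        elif current is None:
            continue
        elif indent == 4:
            in_refs = False
            if item.startswith("Calls:"):
                # Assumption: multiple callees are whitespace-separated.
                current["calls"] = item[len("Calls:"):].split()
            elif item == "Refs:":
                in_refs = True
            else:
                current["kind"] = item
        elif indent == 6 and in_refs:
            current["refs"].append(item)
    return entries


summary = parse_dump(DUMP)
mentioned = {s for e in summary.values() for s in e["calls"] + e["refs"]}
# Symbols used here but without a summary entry in this module.
external = mentioned - set(summary)

if __name__ == "__main__":
    print(sorted(external))
```

For the example module the only such symbol is boop, which is declared but not defined there.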

-- 
Mehdi





