thr3ads.net - llvm dev - [LLVMdev] RFC: Machine Level IR text-based serialization format [Apr 2015]

Alex L

2015-Apr-29 22:24 UTC

[LLVMdev] RFC: Machine Level IR text-based serialization format

2015-04-29 11:40 GMT-07:00 Duncan P. N. Exon Smith <dexonsmith at apple.com>:
>
> > On 2015-Apr-29, at 06:40, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> >
> > On 4/28/2015 7:13 PM, Alex L wrote:
> >>
> >>
> >> 2015-04-28 16:26 GMT-07:00 Matthias Braun <matze at braunis.de>:
> >>
> >>    For that use case it is worth keeping the following things in mind:
> >>    - Please try to keep the output of the various dump functions, esp.
> >>    MachineInstr::dump(), MachineOperand::dump(),
> >>    MachineBasicBlock::dump() as close as possible to the format you use
> >>    for serializing.
> >> [...]
> >>
> >> Ideally the new syntax would replace the existing print/dump syntax. The
> >> new syntax will lead to certain missing information when
> >> this information can be inferred (e.g. the TiedTo and IsEarlyClobber
> >> attributes for register operands that I mentioned earlier in this thread),
> >> so maybe we could have some sort of verbose dumping option where
> >> absolutely everything is dumped.
> >
> >
> > I think that the new syntax is less readable than the current format of
> > the "dump" functions, and in the long term it would be better to have
> > something more human-friendly.  However, using YAML has the advantage that
> > it's easier to parse it than the direct output of "dump" and so it will
> > take less time to implement a YAML-based solution.  My concern is that you
> > may run out of time to complete this and the file format is not the most
> > important thing in this project.  Getting it to work, if only as a proof of
> > concept, would be very helpful to everyone.  Coming up with a fancier
> > grammar and implementing a parser for it could be done later on top of the
> > initial implementation.
> >
> > -Krzysztof
>
> Until I got to this email, I was opposed to using YAML here -- I'd
> prefer a custom grammar and parser -- but I find Krzysztof's point
> here pretty convincing.
>
> Starting with a (hybrid) YAML representation seems like a reasonable
> way to bootstrap a machine IR.  Once it's in place and working, we
> can come back and strip away the YAML parts until it's human-
> friendly.  (And since YAML is machine-friendly, upgrade scripts for
> testcases should be straightforward.)
>
I think that this would be a good approach.
I will work on the proposed YAML hybrid format for now and will begin
sending out the patches soon. Once it's working, people can evaluate it
for themselves and see if it suits them or if we need to change it to a
custom format.

>
> BTW, we probably need some sort of LangRef document for this.  Maybe
> docs/MIRLangRef.rst?

That's fine with me.

Alex

Hayden Livingston

2015-Apr-30 02:13 UTC

What is missing in the current textual format that doesn't allow going
all the way to machine code?

Is the reason for this project because the current .LL format can't
always be put to bitcode?


Duncan P. N. Exon Smith

2015-Apr-30 02:44 UTC

> On 2015 Apr 29, at 19:13, Hayden Livingston <halivingston at gmail.com> wrote:
>
> What is missing in the current textual format that doesn't allow going
> all the way to machine code?

Nothing.

What's missing is the ability to serialize the machine level itself.
Since many passes have to run to get from .ll to .s, it's currently
hard (impossible?) to test individual machine level passes robustly.
Having a way to serialize machine IR will let us test each pass in
isolation.

> Is the reason for this project because the current .LL format can't
> always be put to bitcode?

Nope, .ll and .bc can represent the same things.
