thr3ads.net - llvm dev - [LLVMdev] Proposal: extended MDString syntax [Jun 2013]

Chandler Carruth

2013-Jun-27 17:12 UTC

[LLVMdev] Proposal: extended MDString syntax

On Thu, Jun 27, 2013 at 9:50 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
> On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> > So inverting it so that MI contains LLVM IR instead of the other way
> > around? Then we'd need a serialization format for MI that happened to
> > include a way of serializing LLVM IR within. From a quick "hey, this
> > seems reasonable" the idea of embedding the MI into the IR rather than
> > the other way around seems to make sense since we already have
> > code to serialize the IR.
>
> I’d suggest something based on YAML which would allow you to include IR
> verbatim just by indenting it.

We can also use YAML embedded inside IR, potentially using the string
syntax Dan proposed or any number of other embedding mechanisms.

I like using YAML to represent the somewhat arbitrary data structures of MI
so that we don't spend a lot of time inventing clever syntax for something
that has much more limited uses than the actual IR. I haven't heard anyone
really object to it.

However, I do think it's an open question as to whether to embed IR in an MI
container, or MI in an IR container. A few observations:

- No one has pointed out any really fundamental *problems* with any of the
approaches. I think both approaches can be made to work with reasonable
amounts of effort, and neither has really fundamental design problems.

- Different use cases will be more or less easy to write in different
forms. For example, Jakob's point:
> The IR module should be optional when serializing MI. The back-pointers
> from MI to IR are not required, and I can imagine many useful test cases
> that won’t need them.

I've heard Dan and others say exactly the opposite -- that MI should be
optional. I suspect that some test cases are more MI focused, and some are
less. But I don't see either being optional as a hard prerequisite.


So, here is my concrete suggestion: if all of these approaches seem to work
and there aren't huge downsides but only reasonable tradeoffs, let the
folks writing the patches make the decision. At the moment that appears to
be Dan and maybe Bob. Is there a reason to not let them pick the design
they want to make forward progress with and run with it? I think that will
be much more productive and get us back to the important part: testing
MI-level passes.

Jakob Stoklund Olesen

2013-Jun-27 17:49 UTC

On Jun 27, 2013, at 10:12 AM, Chandler Carruth <chandlerc at google.com>
wrote:
> - Different use cases will be more or less easy to write in different
> forms. For example, Jakob's point:
> > The IR module should be optional when serializing MI. The back-pointers
> > from MI to IR are not required, and I can imagine many useful test cases
> > that won’t need them.
>
> I've heard Dan and others say exactly the opposite -- that MI should be
> optional. I suspect that some test cases are more MI focused, and some are
> less. But I don't see either being optional as a hard prerequisite.

Back-pointers from MI to LLVM IR is a hack that gets the job done, but it is not
good IR design. We are already seeing the usefulness of memory operands crumble
because of the stack coloring pass. Throw in something like modulo scheduling,
and they will be completely wrong for alias analysis.

MI should be allowed to evolve into a proper self-contained IR that doesn’t
depend on LLVM IR.

I don’t want to canonicalize this hack by encoding it in the file format we use
for our tests. A container format that holds LLVM IR and MI as sibling top-level
entities is much easier to gradually change towards a standalone MI IR.

Thanks,
/jakob
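
To make the "sibling top-level entities" idea above concrete, here is a minimal sketch in Python. It is only an illustration and was not posted in the thread: the section names `ir` and `mi`, the `name: |` header convention, and the MI text itself are all hypothetical, and the parser is a few hand-written lines rather than a real YAML library. It shows one way a container could hold IR verbatim by indentation while letting either section be absent.

```python
import textwrap


def split_container(text):
    """Split a hypothetical container into its top-level sections.

    A section starts with an unindented 'name: |' line; the indented
    lines that follow are kept verbatim as that section's body.
    """
    sections = {}
    name = None
    body = []
    for line in text.splitlines():
        if line and not line[0].isspace():
            # An unindented line opens a new top-level section.
            if name is not None:
                sections[name] = textwrap.dedent("\n".join(body)).strip("\n")
            if not line.endswith(": |"):
                raise ValueError(f"expected 'name: |', got {line!r}")
            name = line[: -len(": |")]
            body = []
        elif name is not None:
            # Indented or blank line: part of the current section body.
            body.append(line)
        elif line.strip():
            raise ValueError("indented text before any section header")
    if name is not None:
        sections[name] = textwrap.dedent("\n".join(body)).strip("\n")
    return sections


# Hypothetical test file: IR and MI are siblings, neither wraps the other.
EXAMPLE = """\
ir: |
  define i32 @f(i32 %x) {
    %y = add i32 %x, 1
    ret i32 %y
  }
mi: |
  function: f
  blocks:
    - name: entry
"""

sections = split_container(EXAMPLE)
ir_text = sections.get("ir")  # None when a test carries no IR
mi_text = sections.get("mi")  # None when a test carries no MI
```

Because neither section is nested inside the other, a test that only needs MI can drop the `ir` block entirely, and the reverse also holds.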

Micah Villmow

2013-Jun-27 19:53 UTC

I would have to agree with Jakob's point here. I have to work on the MI IR,
and I'm not guaranteed to have corresponding LLVM IR. Currently I have to
hack up LLVM to get this working. I would much rather have LLVM itself support
an MI IR without needing to have LLVM IR around.

Micah


Dan Gohman

2013-Jun-27 22:22 UTC

On Thu, Jun 27, 2013 at 10:49 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
> Back-pointers from MI to LLVM IR is a hack that gets the job done, but it
> is not good IR design. We are already seeing the usefulness of memory
> operands crumble because of the stack coloring pass. Throw in something
> like modulo scheduling, and they will be completely wrong for alias
> analysis.
>
> MI should be allowed to evolve into a proper self-contained IR that
> doesn’t depend on LLVM IR.
>
> I don’t want to canonicalize this hack by encoding it in the file format
> we use for our tests. A container format that holds LLVM IR and MI as
> sibling top-level entities is much easier to gradually change towards a
> standalone MI IR.
>
This is an interesting point. I tend to think of CodeGen as being an
analysis of LLVM IR. While it can diverge somewhat, the ways in which it
diverges are usually constrained, and leveraging information already
available in LLVM IR has been practical. However, in a world where CodeGen
is doing things like restructuring loops, this seems less practical.

Bob, I look forward to seeing the patches you have.

Thanks,

Dan

Jim Grosbach

2013-Jun-27 23:30 UTC

On Jun 27, 2013, at 10:49 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk>
wrote:
> Back-pointers from MI to LLVM IR is a hack that gets the job done, but it
> is not good IR design. We are already seeing the usefulness of memory
> operands crumble because of the stack coloring pass. Throw in something
> like modulo scheduling, and they will be completely wrong for alias
> analysis.
>
> MI should be allowed to evolve into a proper self-contained IR that
> doesn’t depend on LLVM IR.
>
> I don’t want to canonicalize this hack by encoding it in the file format
> we use for our tests. A container format that holds LLVM IR and MI as
> sibling top-level entities is much easier to gradually change towards a
> standalone MI IR.

I'm on my phone and so can't go into too much depth right now, but FWIW,
I strongly agree with this.

Conceptually, IR and MI are distinct, and we should design to keep their
coupling as light as possible and work to lighten the coupling that already
exists.

Jim
