On Thu, Jun 27, 2013 at 9:50 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

> On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> > So inverting it so that MI contains LLVM IR instead of the other way
> > around? Then we'd need a serialization format for MI that happened to
> > include a way of serializing LLVM IR within. From a quick "hey, this
> > seems reasonable" the idea of embedding the MI into the IR rather than
> > the other way around seems to make sense since we already have
> > code to serialize the IR.
>
> I’d suggest something based on YAML which would allow you to include IR
> verbatim just by indenting it.

We can also use YAML embedded inside IR, potentially using the string syntax Dan proposed or any number of other embedding mechanisms.

I like using YAML to represent the somewhat arbitrary data structures of MI so that we don't spend a lot of time inventing clever syntax for something that has much more limited uses than the actual IR. I haven't heard anyone really object to it.

However, I do think it's an open question as to whether to embed IR in an MI container, or MI in an IR container. A few observations:

- No one has pointed out any really fundamental *problems* with any of the approaches. I think both approaches can be made to work with reasonable amounts of effort, and neither has really fundamental design problems.

- Different use cases will be more or less easy to write in different forms. For example, Jakob's point:

> The IR module should be optional when serializing MI. The back-pointers
> from MI to IR are not required, and I can imagine many useful test cases
> that won’t need them.

I've heard Dan and others say exactly the opposite -- that MI should be optional. I suspect that some test cases are more MI focused, and some are less. But I don't see either being optional as a hard prerequisite.

So, here is my concrete suggestion: if all of these approaches seem to work and there aren't huge downsides but only reasonable tradeoffs, let the folks writing the patches make the decision. At the moment that appears to be Dan and maybe Bob. Is there a reason not to let them pick the design they want to make forward progress with and run with it? I think that will be much more productive and get us back to the important part: testing MI-level passes.
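To make the "include IR verbatim just by indenting it" idea concrete, here is a minimal sketch in Python, assuming the IR is stored as a YAML literal block scalar under a key. The key name `ir`, the two-space indent, and the helper names `embed_ir` and `extract_ir` are hypothetical and chosen only for illustration. This is not a general YAML emitter or parser, and nothing in the thread settles on this layout.

```python
import textwrap


def embed_ir(ir_text: str, key: str = "ir", indent: int = 2) -> str:
    """Wrap IR text in a YAML literal block scalar ("key: |") by indenting it.

    The key name and indent width are illustrative assumptions.
    """
    # Make sure the block ends with exactly one newline, then indent each line.
    body = textwrap.indent(ir_text.rstrip("\n") + "\n", " " * indent)
    return f"{key}: |\n{body}"


def extract_ir(yaml_text: str, key: str = "ir", indent: int = 2) -> str:
    """Recover the IR from text produced by embed_ir (not a general YAML parser)."""
    header, _, body = yaml_text.partition("\n")
    if header != f"{key}: |":
        raise ValueError(f"expected a literal block under key {key!r}")
    prefix = " " * indent
    lines = []
    for line in body.splitlines(keepends=True):
        # Strip exactly the indentation that embed_ir added; blank lines
        # were left unindented, so they pass through unchanged.
        lines.append(line[len(prefix):] if line.startswith(prefix) else line.lstrip(" "))
    return "".join(lines)


if __name__ == "__main__":
    ir = "define i32 @f() {\nentry:\n  ret i32 0\n}\n"
    print(embed_ir(ir))
```

The point of the sketch is that the IR needs no escaping or new syntax: the text survives a round trip through `embed_ir` and `extract_ir` unchanged, and the existing IR parser could be handed the extracted text as-is.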
On Jun 27, 2013, at 10:12 AM, Chandler Carruth <chandlerc at google.com> wrote:

> - Different use cases will be more or less easy to write in different forms. For example, Jakob's point:
>
> > The IR module should be optional when serializing MI. The back-pointers from MI to IR are not required, and I can imagine many useful test cases that won’t need them.
>
> I've heard Dan and others say exactly the opposite -- that MI should be optional. I suspect that some test cases are more MI focused, and some are less. But I don't see either being optional as a hard prerequisite.

Back-pointers from MI to LLVM IR are a hack that gets the job done, but they are not good IR design. We are already seeing the usefulness of memory operands crumble because of the stack coloring pass. Throw in something like modulo scheduling, and they will be completely wrong for alias analysis.

MI should be allowed to evolve into a proper self-contained IR that doesn’t depend on LLVM IR.

I don’t want to canonicalize this hack by encoding it in the file format we use for our tests. A container format that holds LLVM IR and MI as sibling top-level entities is much easier to gradually change towards a standalone MI IR.

Thanks,
/jakob
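As a rough illustration of the "sibling top-level entities" layout, the following Python sketch splits a container file into independent sections, so that a test can carry MI alone, IR alone, or both. The `--- !ir` and `--- !mi` section markers and the function name `split_container` are hypothetical and invented for this sketch. The section bodies are treated as opaque text and are not parsed.

```python
from typing import Dict, List, Optional


def split_container(text: str) -> Dict[str, str]:
    """Split a container file into sibling sections keyed by their tag.

    A section starts at a line of the form "--- !<tag>" (a hypothetical
    marker) and runs until the next marker or the end of the file.
    """
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines(keepends=True):
        if line.startswith("--- !"):
            current = line[len("--- !"):].strip()
            if current in sections:
                raise ValueError(f"duplicate section {current!r}")
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        elif line.strip():
            # Non-blank text before any marker belongs to no section.
            raise ValueError("content before the first section marker")
    return {tag: "".join(lines) for tag, lines in sections.items()}


if __name__ == "__main__":
    mi_only = "--- !mi\nname: f\nbody: []\n"
    parsed = split_container(mi_only)
    # Neither section is required; this file simply has no IR section.
    print(sorted(parsed))
```

Because neither section nests inside the other, dropping the IR section later would not change how the MI section is written, which is the property argued for above.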
I would have to agree with Jakob's point here. I have to work on the MI IR, and I'm not guaranteed to have corresponding LLVM IR. Currently I have to hack up LLVM to get this working. I would much rather have LLVM itself support an MI IR without needing to have LLVM IR around.

Micah

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Jakob Stoklund Olesen
Sent: Thursday, June 27, 2013 10:49 AM
To: Chandler Carruth
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Proposal: extended MDString syntax

Back-pointers from MI to LLVM IR are a hack that gets the job done, but they are not good IR design. We are already seeing the usefulness of memory operands crumble because of the stack coloring pass. Throw in something like modulo scheduling, and they will be completely wrong for alias analysis.

MI should be allowed to evolve into a proper self-contained IR that doesn't depend on LLVM IR.

I don't want to canonicalize this hack by encoding it in the file format we use for our tests. A container format that holds LLVM IR and MI as sibling top-level entities is much easier to gradually change towards a standalone MI IR.

Thanks,
/jakob

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thu, Jun 27, 2013 at 10:49 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

> Back-pointers from MI to LLVM IR are a hack that gets the job done, but
> they are not good IR design. We are already seeing the usefulness of memory
> operands crumble because of the stack coloring pass. Throw in something
> like modulo scheduling, and they will be completely wrong for alias
> analysis.
>
> MI should be allowed to evolve into a proper self-contained IR that
> doesn’t depend on LLVM IR.
>
> I don’t want to canonicalize this hack by encoding it in the file format
> we use for our tests. A container format that holds LLVM IR and MI as
> sibling top-level entities is much easier to gradually change towards a
> standalone MI IR.

This is an interesting point. I have tended to think of CodeGen as an analysis of LLVM IR: while it can diverge somewhat, the ways in which it diverges are usually constrained, so leveraging information already available in LLVM IR was practical. However, in a world where CodeGen is doing things like restructuring loops, this seems less practical.

Bob, I look forward to seeing the patches you have.

Thanks,

Dan
On Jun 27, 2013, at 10:49 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

> Back-pointers from MI to LLVM IR is a hack that gets the job done, but it is not good IR design. We are already seeing the usefulness of memory operands crumble because of the stack coloring pass. Throw in something like modulo scheduling, and they will be completely wrong for alias analysis.
>
> MI should be allowed to evolve into a proper self-contained IR that doesn’t depend on LLVM IR.
>
> I don’t want to canonicalize this hack by encoding it in the file format we use for our tests. A container format that holds LLVM IR and MI as sibling top-level entities is much easier to gradually change towards a standalone MI IR.

I'm on my phone and so can't go into too much depth right now, but FWIW, I strongly agree with this. Conceptually, IR and MI are distinct, and we should design to keep their coupling as light as possible and work to lighten the coupling that already exists.

Jim