On Wed, Jun 26, 2013 at 3:59 PM, Nadav Rotem <nrotem at apple.com> wrote:
>
> On Jun 26, 2013, at 3:51 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> Can you suggest an alternative solution? Can you describe why you don't
> think metadata is the right container? This alone isn't really helpful at
> moving us toward something that there has been widespread agreement LLVM
> needs.
>
>
> Hi Chandler,
>
> Sure, we can talk about serializing MF. But the discussion should focus on
> serializing MF, and not multi-line metadata support, which is only one of
> the possible solutions. I understand the problem that Dan mentioned (that
> MF references IR), and I am sure that there are other problems that he did
> not mention. I would be happy to hear more about other solutions that you
> considered and other problems that you ran into. Have you considered using
> a new format that embeds LLVM-IR ?
>

(Note, this is the first I've heard of this plan and just figured it out myself)

So inverting it so that MI contains LLVM IR instead of the other way
around? Then we'd need a serialization format for MI that happened to
include a way of serializing LLVM IR within. From a quick "hey, this
seems reasonable" the idea of embedding the MI into the IR rather than
the other way around seems to make sense since we already have
code to serialize the IR.

The only other idea I've seen was an intern project that really didn't
go very far a few years ago of using *AML (one of them, I can't recall
which). I think Bob had some idea of finishing the project, but I'm
not sure where it's going.

Do you have any other ideas or some ideas as to why you'd prefer one
direction rather than the other?

-eric
On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:

> (Note, this is the first I've heard of this plan and just figured it out myself)

Yes, this is also the first time I heard about this and I haven’t had a chance to think about this problem too deeply.

>
> So inverting it so that MI contains LLVM IR instead of the other way
> around? Then we'd need a serialization format for MI that happened to
> include a way of serializing LLVM IR within. From a quick "hey, this
> seems reasonable" the idea of embedding the MI into the IR rather than
> the other way around seems to make sense since we already have
> code to serialize the IR.
>
> The only other idea I've seen was an intern project that really didn't
> go very far a few years ago of using *AML (one of them, I can't recall
> which). I think Bob had some idea of finishing the project, but I'm
> not sure where it's going.
>
> Do you have any other ideas or some ideas as to why you'd prefer one
> direction rather than the other?
>
> -eric

I think that the two alternatives that are obvious are for the MF to contain the IR, or for the IR to contain the MF. Alternatively, they can live in parallel and the MF may reference the IR. I am not sure what is the right approach here, but my gut feeling is that metadata is not necessarily the right container for MF.
On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:

> On Wed, Jun 26, 2013 at 3:59 PM, Nadav Rotem <nrotem at apple.com> wrote:
>>
>> On Jun 26, 2013, at 3:51 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>
>> Can you suggest an alternative solution? Can you describe why you don't
>> think metadata is the right container? This alone isn't really helpful at
>> moving us toward something that there has been widespread agreement LLVM
>> needs.
>>
>>
>> Hi Chandler,
>>
>> Sure, we can talk about serializing MF. But the discussion should focus on
>> serializing MF, and not multi-line metadata support, which is only one of
>> the possible solutions. I understand the problem that Dan mentioned (that
>> MF references IR), and I am sure that there are other problems that he did
>> not mention. I would be happy to hear more about other solutions that you
>> considered and other problems that you ran into. Have you considered using
>> a new format that embeds LLVM-IR ?
>>
>
> (Note, this is the first I've heard of this plan and just figured it out myself)
>
> So inverting it so that MI contains LLVM IR instead of the other way
> around? Then we'd need a serialization format for MI that happened to
> include a way of serializing LLVM IR within. From a quick "hey, this
> seems reasonable" the idea of embedding the MI into the IR rather than
> the other way around seems to make sense since we already have
> code to serialize the IR.
>
> The only other idea I've seen was an intern project that really didn't
> go very far a few years ago of using *AML (one of them, I can't recall
> which). I think Bob had some idea of finishing the project, but I'm
> not sure where it's going.
>
> Do you have any other ideas or some ideas as to why you'd prefer one
> direction rather than the other?

Bin Zeng worked on a project as an intern last summer to serialize machine functions to yaml. At the time, we were unable to commit it to trunk because we were waiting for Nick's yamlio work to get committed. I've still got his patches and plan to commit them whenever I get a chance. I was also considering having another intern pick up that project where it left off.

The approach is perhaps similar to what Dan is proposing, just flipped around. In one scheme, the top-level container is yaml and the IR is embedded within it along with the machine function stuff. In the other, the IR is the top-level container and the machine functions are embedded as metadata. I prefer the yaml approach.

I'd be glad to reprioritize contributing the rest of Bin's patches to make those available sooner rather than later. The more interesting part, with either scheme, is how to represent the machine functions. We definitely want something that is readable but still easy to parse.
> I think that the two alternatives that are obvious are for the MF to contain
> the IR, or for the IR to contain the MF. Alternatively, they can live in
> parallel and the MF may reference the IR. I am not sure what is the right
> approach here, but my gut feeling is that metadata is not necessarily the
> right container for MF.

Off the cuff I'd think that IR containing MF seems most reasonable, and using metadata to contain it seems good from two perspectives: a) it already exists, and b) oddly enough, being able to get rid of the metadata and still have a valid module/compilation unit seems like it might be interestingly useful, though I'm not sure what uses there are off the top of my head.

That said, I really have no preference either way, just idle speculation. Probably similar to you, since we've both not thought deeply upon it :)

The MDString stuff does seem like it might be useful in general, if we'd like to have that though.

-eric
> Bin Zeng worked on a project as an intern last summer to serialize machine functions to yaml. At the time, we were unable to commit it to trunk because we were waiting for Nick's yamlio work to get committed. I've still got his patches and plan to commit them whenever I get a chance. I was also considering having another intern pick up that project where it left off.
>
> The approach is perhaps similar to what Dan is proposing, just flipped around. In one scheme, the top-level container is yaml and the IR is embedded within it along with the machine function stuff. In the other, the IR is the top-level container and the machine functions are embedded as metadata. I prefer the yaml approach.
>

Any reason? I remember the project, of course, but didn't really have a good feel for any of the design decisions other than "hey, there's this yaml thing". That said, I don't believe I was in on the design discussion in the first place.

> I'd be glad to reprioritize contributing the rest of Bin's patches to make those available sooner rather than later. The more interesting part, with either scheme, is how to represent the machine functions. We definitely want something that is readable but still easy to parse.

At least posting them with some description, a design for how it works, and the tradeoffs would be good. Then they'd be out there to look at and discuss.

-eric
On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:

> So inverting it so that MI contains LLVM IR instead of the other way
> around? Then we'd need a serialization format for MI that happened to
> include a way of serializing LLVM IR within. From a quick "hey, this
> seems reasonable" the idea of embedding the MI into the IR rather than
> the other way around seems to make sense since we already have
> code to serialize the IR.

I’d suggest something based on YAML which would allow you to include IR verbatim just by indenting it.

The IR module should be optional when serializing MI. The back-pointers from MI to IR are not required, and I can imagine many useful test cases that won’t need them.

module: |
  define void @linkit(i8* %source) #0 {
  entry:
    %.b243 = load i1* @Pflag, align 1
    %cond = select i1 %.b243, i32 (i8*, %struct.stat.6.13.20.64*)* @lstat, i32 (i8*, %struct.stat.6.13.20.64*)* @stat
    %call = call signext i32 %cond(i8* %source, %struct.stat.6.13.20.64* undef) #2
    ret void
  }
  @Pflag = external unnamed_addr global i1
  declare signext i32 @lstat(i8* nocapture, %struct.stat.6.13.20.64* nocapture) #1
  declare signext i32 @stat(i8* nocapture, %struct.stat.6.13.20.64* nocapture) #1

mi: |
  BB#0: derived from LLVM BB %entry
      Live Ins: %I0
      %O6<def> = SAVEri %O6, -176
      %I1<def> = SETHIi <ga:@Pflag>[TF=3]
      %I1<def> = ADDri %I1<kill>, <ga:@Pflag>[TF=4]
      %I1<def> = SLLXri %I1<kill>, 12
      %I2<def> = LDUBri %I1<kill>, <ga:@Pflag>[TF=5]; mem:LD1[@Pflag]
      %I1<def> = SETHIi <ga:@stat>[TF=3]
      %I1<def> = ADDri %I1<kill>, <ga:@stat>[TF=4]
      %I1<def> = SLLXri %I1<kill>, 12
      %I1<def> = ADDri %I1<kill>, <ga:@stat>[TF=5]
      %I3<def> = SETHIi <ga:@lstat>[TF=3]
      %I3<def> = ADDri %I3<kill>, <ga:@lstat>[TF=4]
      %I3<def> = SLLXri %I3<kill>, 12
      %I3<def> = ADDri %I3<kill>, <ga:@lstat>[TF=5]
      CMPri %I2<kill>, 0, %ICC<imp-def>
      %I1<def,tied2> = MOVXCCrr %I3<kill>, %I1<kill,tied0>, 9, %ICC<imp-use,kill>
      JMPLrr %I1<kill>, %G0, %O0<kill>, %O1<undef>, %O0<imp-def,dead>, %O1<imp-def,dead>, %ICC<imp-def,dead>, %O6<imp-use>, ...
      %O0<def> = ORrr %G0, %I0<kill>
      RET 8
      %G0<def> = RESTORErr %G0, %G0

We could also use more YAML structure to represent MI functions and basic blocks, if needed.

Thanks,
/jakob
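
For illustration only (this is not from the thread, and not how LLVM does or would do it): a minimal Python sketch of how a test tool might split a file in the layout suggested above into its IR and MI parts. It assumes the file consists only of top-level `key: |` literal blocks, as in the example; `split_block_scalars` and `EXAMPLE` are hypothetical names, and a real tool would presumably use a proper YAML parser instead.

```python
import textwrap

# Hypothetical, cut-down input in the "module: | / mi: |" layout.
EXAMPLE = """\
module: |
  define void @f() {
  entry:
    ret void
  }

mi: |
  BB#0: derived from LLVM BB %entry
      RET 8
"""


def split_block_scalars(text):
    """Split a document made of top-level `key: |` literal blocks.

    Handles only the small YAML subset used in the example: an
    unindented `name: |` line followed by indented body lines.
    """
    sections = {}
    key, body = None, []

    def flush():
        # Store the finished section with its common indentation removed.
        if key is not None:
            sections[key] = textwrap.dedent("\n".join(body)).strip("\n") + "\n"

    for line in text.splitlines():
        stripped = line.rstrip()
        if stripped.endswith(": |") and not line[0].isspace():
            flush()
            key, body = stripped[:-3], []
        elif key is not None:
            body.append(line)
    flush()
    return sections


parts = split_block_scalars(EXAMPLE)
ir_text = parts.get("module")  # None when the file carries no IR module
mi_text = parts["mi"]
print(mi_text)
```

Because the result is a plain dictionary, a file holding only an `mi` block still splits cleanly, which matches the point that the IR module should be optional.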
On Thu, Jun 27, 2013 at 9:50 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:

> On Jun 26, 2013, at 4:18 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> > So inverting it so that MI contains LLVM IR instead of the other way
> > around? Then we'd need a serialization format for MI that happened to
> > include a way of serializing LLVM IR within. From a quick "hey, this
> > seems reasonable" the idea of embedding the MI into the IR rather than
> > the other way around seems to make sense since we already have
> > code to serialize the IR.
>
> I’d suggest something based on YAML which would allow you to include IR
> verbatim just by indenting it.
>

We can also use YAML embedded inside IR, potentially using the string syntax Dan proposed or any number of other embedding mechanisms.

I like using YAML to represent the somewhat arbitrary data structures of MI so that we don't spend a lot of time inventing clever syntax for something that has much more limited uses than the actual IR. I haven't heard anyone really object to it.

However, I do think it's an open question as to whether to embed IR in an MI container, or MI in an IR container. A few observations:

- No one has pointed out any really fundamental *problems* with any of the approaches. I think both approaches can be made to work with reasonable amounts of effort, and neither has really fundamental design problems.

- Different use cases will be more or less easy to write in different forms. For example, Jakob's point:

> The IR module should be optional when serializing MI. The back-pointers
> from MI to IR are not required, and I can imagine many useful test cases
> that won’t need them.

I've heard Dan and others say exactly the opposite -- that MI should be optional. I suspect that some test cases are more MI focused, and some are less. But I don't see either being optional as a hard prerequisite.

So, here is my concrete suggestion: if all of these approaches seem to work and there aren't huge downsides but only reasonable tradeoffs, let the folks writing the patches make the decision. At the moment that appears to be Dan and maybe Bob. Is there a reason to not let them pick the design they want to make forward progress with and run with it? I think that will be much more productive and get us back to the important part: testing MI-level passes.