thr3ads.net - llvm dev - [LLVMdev] Proposal: extended MDString syntax [Jun 2013]

Bob Wilson

2013-Jun-27 05:43 UTC

[LLVMdev] Proposal: extended MDString syntax

On Jun 26, 2013, at 4:36 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Wed, Jun 26, 2013 at 4:30 PM, Eric Christopher <echristo at gmail.com> wrote:
>> Off the cuff I'd think that IR containing MF seems most reasonable and
>> the use of metadata to contain it seems to be good from two
>> perspectives I think:
>>
>> a) it already exists,
>> b) oddly enough that we could get rid of the metadata and still have a
>> valid module/compilation unit seems like it might be interestingly
>> useful, but I'm not sure what uses there are off the top of my head.
>
> I'll give the reason why I like this having just thought about it a while:
>
> I think of this as a pre-lowered hint. IE, take some IR, and give a hint to
> the code generator to lower like this over here. I see a few benefits of
> this model:
>
> - It makes it reasonably easy to only specify the MI for the bit you really
> are trying to test. You can let the normal lowering process handle any other
> bits. I think this will help keep test cases small and reasonable.
>
> - It makes it easy to re-baseline when the code generator changes but the
> changes are acceptable -- strip metadata and run it through the existing
> pipeline.
>
> - It has the potential to be "incomplete" or of varying degrees of
> completeness which I think will be useful in testing different layers of the
> system... but Dan probably has more/better thoughts on this front than I do.
>
> The one thing I don't really like about the reversed model of MI containing
> IR is that now the MI model has to be "complete", so we have to invent what
> that means. I'm not really interested in this outside of generating test
> cases, so anything that simplifies the space of what we have to design
> *really* appeals to me.

I don't have a strong opinion either way.

I don't understand your comment about the MI model needing to be "complete".
The yaml approach was not "MI containing IR".  In fact, the initial
implementation doesn't have good support for serializing machine instructions,
but it works great for IR-level passes run by llc, e.g., codegenprepare.  The
yaml file is just a way of collecting the various kinds of information needed
for that, and you can omit the machine instructions entirely if you want to
serialize after an IR-level pass.  I think all of the benefits you mention for
using metadata could apply just as well to using yaml; it's just a matter of
how you stuff the data into a file.
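
As a rough illustration of the point about omitting machine instructions, here is a minimal Python sketch of a container in which the machine-instruction section is optional. The class name `SerializedFunction` and the keys `ir`, `last_pass`, and `machine_instrs` are assumptions made up for this example; they are not the format of the yaml implementation described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SerializedFunction:
    """Hypothetical container; field names are illustrative only."""
    ir_text: str                                  # textual LLVM IR, always present
    last_pass: Optional[str] = None               # name of the pass we stopped after
    machine_instrs: Optional[List[str]] = None    # None when dumped after an IR-level pass

    def to_record(self) -> dict:
        """Build a plain dict suitable for any structured dump (yaml, metadata, ...)."""
        record = {"ir": self.ir_text}
        if self.last_pass is not None:
            record["last_pass"] = self.last_pass
        if self.machine_instrs is not None:
            # Only emitted once machine instructions actually exist.
            record["machine_instrs"] = list(self.machine_instrs)
        return record
```

A dump taken after an IR-level pass such as codegenprepare would then carry only the `ir` and `last_pass` entries, and the same record could be written out as yaml or attached as metadata.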

Some other things to keep in mind:

- There are a number of different data structures that will need to be
serialized to really make this work.  Besides the IR and the
MachineInstructions, there are various data structures in MachineFunctions, some
of which are target-specific.  Yaml works well for that because it provides a
nicely structured way of organizing that data.  The same could be done with
metadata, though.

- One idea that Bin implemented last summer was to stash the last pass in the
yaml.  Unlike IR-level passes, llc has more constraints on the order in which it
runs passes.  We decided to just accept that limitation and assume a fixed order
for the passes. We added the -stop-after option to specify where in the pass
sequence to stop and serialize the code out to a yaml file.  By including the
name of the -stop-after pass in the yaml output, we automatically know where to
start up again when processing a yaml input.  There are some cases where passes
are run more than once, and I don’t think we had a good solution for handling
that.
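
The resume-from-last-pass idea above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the pass names in `PASS_ORDER` and the function `passes_to_run` are hypothetical and do not reflect llc's real pipeline or the code Bin wrote.

```python
# Hypothetical fixed pass order; the names are illustrative, not llc's real pipeline.
PASS_ORDER = ["codegenprepare", "isel", "regalloc", "prologepilog"]


def passes_to_run(last_pass=None, stop_after=None, order=PASS_ORDER):
    """Return the passes to run, resuming after `last_pass` and stopping after `stop_after`.

    `last_pass` stands for the pass name recorded in a serialized input file;
    `stop_after` stands for the pass named on the command line.
    """
    for name in (last_pass, stop_after):
        if name is not None and order.count(name) != 1:
            # A pass that is missing or appears more than once cannot be
            # located by name alone (the repeated-pass problem noted above).
            raise ValueError(f"cannot locate pass unambiguously: {name!r}")
    start = 0 if last_pass is None else order.index(last_pass) + 1
    end = len(order) if stop_after is None else order.index(stop_after) + 1
    if end < start:
        raise ValueError("stop_after pass precedes the recorded last pass")
    return order[start:end]
```

Because the order is fixed, a single recorded pass name is enough to pick the restart point. The `count(name) != 1` check shows why a pass that runs more than once breaks this scheme: its name no longer identifies one position in the sequence.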

I’m curious to find out if you have ideas for how to serialize the actual
machine instructions.  That’s where it really gets interesting, IMO.

Micah Villmow

2013-Jun-27 15:13 UTC

[LLVMdev] Proposal: extended MDString syntax

One question I have about this: what is the use case being targeted here?

Micah


Bob Wilson

2013-Jun-27 16:05 UTC

[LLVMdev] Proposal: extended MDString syntax

There are a variety of potential uses, but at a minimum, we would like to be
able to run individual code-gen passes for debugging and unit testing, just like
we do for IR-level passes.

