Duncan P. N. Exon Smith
2014-Nov-10  01:02 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
TL;DR: If you use metadata (especially if it's out-of-tree), check the
numbered list of lost functionality below to see whether I'm trying to
break your compiler permanently.
In response to my recent commits (e.g., [1]) that changed API from
`MDNode` to `Value`, Eric had a really interesting idea [2] -- split
metadata entirely from the `Value` hierarchy, and drop general support
for metadata RAUW.
After hacking together some example code, this seems overwhelmingly to
me like the right direction.  See the attached metadata-v2.patch for my
sketch of what the current metadata primitives might look like in the
new hierarchy (includes LLVMContextImpl uniquing support).
The initial motivation was to avoid the API weaknesses caused by having
non-`MDNode` metadata that inherits from `Value`.  In particular,
instead of changing API from `MDNode` to `Value`, change it to a base
class called `Metadata`, which sheds the underused and expensive `Value`
base class entirely.
The implications are broader: an enormous reduction in complexity (and
overhead) for metadata.
This change implies minor (major?) loss of functionality for metadata,
but Eric and I suspect that the only hard-to-fix user is debug info
itself, whose IR infrastructure I'm rewriting anyway.
Here is what we'd lose:
 1. No more arbitrary RAUW of metadata.
    While we'd keep support for RAUW of temporary MDNodes for use as
    forward declarations (when parsing assembly or constructing cyclic
    graphs), drop RAUW support for all other metadata.
    Note that we'd also keep support for RAUW of `Value` operands of
    metadata.
    If the RAUW of an operand causes a uniquing collision, uniquing
    would be dropped for that node.  This matches the current behaviour
    when an operand goes to null.
    Upgrade path: none.
 2. No more function-local metadata.
    AFAICT, function-local metadata is *only* used for indirect
    references to instructions and arguments in `@llvm.dbg.value` and
    `@llvm.dbg.declare` intrinsics.  The first argument of the following
    is an example:
        call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
                                  metadata !1)
    Note that the debug info people uniformly seem to dislike the status
    quo, since it's awkward to get from a `Value` to the corresponding
    intrinsic.
    Upgrade path: Instead of using an intrinsic that references a
    function-local value through an `MDNode`, attach metadata to the
    corresponding argument or instruction, or to the terminating
    instruction of the basic block.  (This requires new support for
    attaching metadata to function arguments, which I'll have to add for
    debug info anyway.)
Is this going to break your compiler?  How?  Why is your use case worth
supporting?
[1]:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html
    "r221167 - IR: MDNode => Value:
Instruction::getAllMetadataOtherThanDebugLoc()"
[2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html
    "Re: First-class debug info IR: MDLocation"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: metadata-v2.patch
Type: application/octet-stream
Size: 31022 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141109/4d4fffcf/attachment.obj>
Chandler Carruth
2014-Nov-10  01:49 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
FWIW, this completely addresses my only ill feelings about the changes you were making. I really like it. I might bikeshed some of the names, but whatever. I would suggest maybe augmenting some of the doxygen comments to help show the specific use case that motivates the node? You already have this in a few places and it really helped me skimming the code. In particular, there are a bunch of very similar nodes around the tracking, temp, uniquable, etc. and I think it'll be important to clearly distinguish each use case that these are designed to address. Also just want to say thanks for diving into the design and finding something (at least, I'm hoping!) works even better. On Sun, Nov 9, 2014 at 7:02 PM, Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> TL;DR: If you use metadata (especially if it's out-of-tree), check the > numbered list of lost functionality below to see whether I'm trying to > break your compiler permanently. > > In response to my recent commits (e.g., [1]) that changed API from > `MDNode` to `Value`, Eric had a really interesting idea [2] -- split > metadata entirely from the `Value` hierarchy, and drop general support > for metadata RAUW. > > After hacking together some example code, this seems overwhelmingly to > me like the right direction. See the attached metadata-v2.patch for my > sketch of what the current metadata primitives might look like in the > new hierarchy (includes LLVMContextImpl uniquing support). > > The initial motivation was to avoid the API weaknesses caused by having > non-`MDNode` metadata that inherits from `Value`. In particular, > instead of changing API from `MDNode` to `Value`, change it to a base > class called `Metadata`, which sheds the underused and expensive `Value` > base class entirely. > > The implications are broader: an enormous reduction in complexity (and > overhead) for metadata. > > This change implies minor (major?) loss of functionality for metadata, > but Eric and I suspect that the only hard-to-fix user is debug info > itself, whose IR infrastructure I'm rewriting anyway. > > Here is what we'd lose: > > 1. No more arbitrary RAUW of metadata. > > While we'd keep support for RAUW of temporary MDNodes for use as > forward declarations (when parsing assembly or constructing cyclic > graphs), drop RAUW support for all other metadata. > > Note that we'd also keep support for RAUW of `Value` operands of > metadata. > > If the RAUW of an operand causes a uniquing collision, uniquing > would be dropped for that node. This matches the current behaviour > when an operand goes to null. > > Upgrade path: none. > > 2. No more function-local metadata. > > AFAICT, function-local metadata is *only* used for indirect > references to instructions and arguments in `@llvm.dbg.value` and > `@llvm.dbg.declare` intrinsics. The first argument of the following > is an example: > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > metadata !1) > > Note that the debug info people uniformly seem to dislike the status > quo, since it's awkward to get from a `Value` to the corresponding > intrinsic. > > Upgrade path: Instead of using an intrinsic that references a > function-local value through an `MDNode`, attach metadata to the > corresponding argument or instruction, or to the terminating > instruction of the basic block. (This requires new support for > attaching metadata to function arguments, which I'll have to add for > debug info anyway.) > > Is this going to break your compiler? How? Why is your use case worth > supporting? > > [1]: > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html > "r221167 - IR: MDNode => Value: > Instruction::getAllMetadataOtherThanDebugLoc()" > [2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html > "Re: First-class debug info IR: MDLocation" > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141109/4a618cf2/attachment.html>
Bob Wilson
2014-Nov-10  05:55 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On Nov 9, 2014, at 5:49 PM, Chandler Carruth <chandlerc at google.com> wrote: > > FWIW, this completely addresses my only ill feelings about the changes you were making. I really like it.I really like it, too. Eric, thanks for the great suggestion!> > I might bikeshed some of the names, but whatever. I would suggest maybe augmenting some of the doxygen comments to help show the specific use case that motivates the node? You already have this in a few places and it really helped me skimming the code. In particular, there are a bunch of very similar nodes around the tracking, temp, uniquable, etc. and I think it'll be important to clearly distinguish each use case that these are designed to address. > > Also just want to say thanks for diving into the design and finding something (at least, I'm hoping!) works even better. > > On Sun, Nov 9, 2014 at 7:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote: > TL;DR: If you use metadata (especially if it's out-of-tree), check the > numbered list of lost functionality below to see whether I'm trying to > break your compiler permanently. > > In response to my recent commits (e.g., [1]) that changed API from > `MDNode` to `Value`, Eric had a really interesting idea [2] -- split > metadata entirely from the `Value` hierarchy, and drop general support > for metadata RAUW. > > After hacking together some example code, this seems overwhelmingly to > me like the right direction. See the attached metadata-v2.patch for my > sketch of what the current metadata primitives might look like in the > new hierarchy (includes LLVMContextImpl uniquing support). > > The initial motivation was to avoid the API weaknesses caused by having > non-`MDNode` metadata that inherits from `Value`. In particular, > instead of changing API from `MDNode` to `Value`, change it to a base > class called `Metadata`, which sheds the underused and expensive `Value` > base class entirely. > > The implications are broader: an enormous reduction in complexity (and > overhead) for metadata. > > This change implies minor (major?) loss of functionality for metadata, > but Eric and I suspect that the only hard-to-fix user is debug info > itself, whose IR infrastructure I'm rewriting anyway. > > Here is what we'd lose: > > 1. No more arbitrary RAUW of metadata. > > While we'd keep support for RAUW of temporary MDNodes for use as > forward declarations (when parsing assembly or constructing cyclic > graphs), drop RAUW support for all other metadata. > > Note that we'd also keep support for RAUW of `Value` operands of > metadata. > > If the RAUW of an operand causes a uniquing collision, uniquing > would be dropped for that node. This matches the current behaviour > when an operand goes to null. > > Upgrade path: none. > > 2. No more function-local metadata. > > AFAICT, function-local metadata is *only* used for indirect > references to instructions and arguments in `@llvm.dbg.value` and > `@llvm.dbg.declare` intrinsics. The first argument of the following > is an example: > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > metadata !1) > > Note that the debug info people uniformly seem to dislike the status > quo, since it's awkward to get from a `Value` to the corresponding > intrinsic. > > Upgrade path: Instead of using an intrinsic that references a > function-local value through an `MDNode`, attach metadata to the > corresponding argument or instruction, or to the terminating > instruction of the basic block. (This requires new support for > attaching metadata to function arguments, which I'll have to add for > debug info anyway.) > > Is this going to break your compiler? How? Why is your use case worth > supporting? > > [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html <http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html> > "r221167 - IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()" > [2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html <http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html> > "Re: First-class debug info IR: MDLocation" > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu> http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev <http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev> > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141109/d064c823/attachment.html>
Hal Finkel
2014-Nov-10  06:24 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
----- Original Message -----> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Sunday, November 9, 2014 7:02:43 PM > Subject: [LLVMdev] [RFC] Separating Metadata from the Value hierarchy > > > > TL;DR: If you use metadata (especially if it's out-of-tree), check > the > numbered list of lost functionality below to see whether I'm trying > to > break your compiler permanently. > > In response to my recent commits (e.g., [1]) that changed API from > `MDNode` to `Value`, Eric had a really interesting idea [2] -- split > metadata entirely from the `Value` hierarchy, and drop general > support > for metadata RAUW. > > After hacking together some example code, this seems overwhelmingly > to > me like the right direction. See the attached metadata-v2.patch for > my > sketch of what the current metadata primitives might look like in the > new hierarchy (includes LLVMContextImpl uniquing support). > > The initial motivation was to avoid the API weaknesses caused by > having > non-`MDNode` metadata that inherits from `Value`. In particular, > instead of changing API from `MDNode` to `Value`, change it to a base > class called `Metadata`, which sheds the underused and expensive > `Value` > base class entirely. > > The implications are broader: an enormous reduction in complexity > (and > overhead) for metadata. > > This change implies minor (major?) loss of functionality for > metadata, > but Eric and I suspect that the only hard-to-fix user is debug info > itself, whose IR infrastructure I'm rewriting anyway. > > Here is what we'd lose: > > 1. No more arbitrary RAUW of metadata. > > While we'd keep support for RAUW of temporary MDNodes for use as > forward declarations (when parsing assembly or constructing cyclic > graphs), drop RAUW support for all other metadata.So temporary MDNodes would be Values, but regular metadata would not be? Will regular metadata nodes no longer have lists of users? If I have a TrackingVH<MDNode> with temporary MDNodes, after I call replaceAllUsesWith, what happens? I'm specifically wondering how we'll need to update CloneAliasScopeMetadata in lib/Transforms/Utils/InlineFunction.cpp, and I don't see any fundamental problem with what you've proposed, but these seem like generic upgrade questions. Thanks again, Hal> > Note that we'd also keep support for RAUW of `Value` operands of > metadata. > > If the RAUW of an operand causes a uniquing collision, uniquing > would be dropped for that node. This matches the current behaviour > when an operand goes to null. > > Upgrade path: none. > > 2. No more function-local metadata. > > AFAICT, function-local metadata is *only* used for indirect > references to instructions and arguments in `@llvm.dbg.value` and > `@llvm.dbg.declare` intrinsics. The first argument of the following > is an example: > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > metadata !1) > > Note that the debug info people uniformly seem to dislike the status > quo, since it's awkward to get from a `Value` to the corresponding > intrinsic. > > Upgrade path: Instead of using an intrinsic that references a > function-local value through an `MDNode`, attach metadata to the > corresponding argument or instruction, or to the terminating > instruction of the basic block. (This requires new support for > attaching metadata to function arguments, which I'll have to add for > debug info anyway.) > > Is this going to break your compiler? How? Why is your use case worth > supporting? > > [1]: > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html > "r221167 - IR: MDNode => Value: > Instruction::getAllMetadataOtherThanDebugLoc()" > [2]: > http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html > "Re: First-class debug info IR: MDLocation" > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Chris Lattner
2014-Nov-10  16:30 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On Nov 9, 2014, at 5:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > In response to my recent commits (e.g., [1]) that changed API from > `MDNode` to `Value`, Eric had a really interesting idea [2] -- split > metadata entirely from the `Value` hierarchy, and drop general support > for metadata RAUW.Wow, this never occurred to me, but in hindsight seems like obviously the right direction.> Here is what we'd lose: > > 1. No more arbitrary RAUW of metadata. > > While we'd keep support for RAUW of temporary MDNodes for use as > forward declarations (when parsing assembly or constructing cyclic > graphs), drop RAUW support for all other metadata.This seems fine to me, a corollary of this should be that MDNodes never “move around” in memory due to late uniquing etc, which means that TrackingVH shouldn’t be necessary, right? This should make all frontends a lot more memory efficient because they can just use raw pointers to MDNodes everywhere.> 2. No more function-local metadata.This was a great idea that never mattered, I’m happy to drop it for the greater good :-) Some comments on the patch-in-progress: +++ b/include/llvm/IR/MetadataV2.h +#include "llvm/ADT/ArrayRef.h" +#include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/PointerUnion.h" +#include "llvm/ADT/StringMap.h" +#include "llvm/IR/Value.h" +#include "llvm/Support/ErrorHandling.h" Please factor things to avoid including heavy headers like DenseMap and StringMap (e.g. by moving the Impl stuff out). +class Metadata { +private: + LLVMContext &Context; Out of curiosity, why do Metadata nodes all need to carry around a context? This seems like memory bloat that would be great to avoid if possible. +template <class T> class TypedMDRef : public MDRef { Is this a safe way to model this, should it use private inheritance instead? Covariance seems like it would break this: specifically something could pass off a typed MDRef as an MDRef&, then the client could reassign something of the wrong type into it. +/// \note This is really cheap and easy to support, and it's useful for +/// implementing: When this makes more progress, the comments should explain what it does and how it works, out of context of the patch. + MDString(LLVMContext *Context) : Metadata(*Context, MDStringKind) { + assert(Context && "Expected non-null context"); ... + static MDStringRef get(LLVMContext &Context, StringRef String); You have reference/pointer disagreement, I’d recommend taking LLVMContext& to the ctor for uniformity (and remove the assert). +class ReplaceableMetadataImpl { I’m not following your algorithm intentions well yet, but this seems like a really heavy-weight implementation, given that this is a common base class for a number of other important things. +class ValueAsMetadata : public Metadata, ReplaceableMetadataImpl { I think that ValueAsMetadata is a great thing to have - it is essential to refer to global variables etc. That said, I think that it will eventually be worthwhile to have metadata versions of integers and other common things to reduce memory. This should come in a later pass of course. I don’t follow the point of +class UniquableMDNode : public MDNode { What would subclass it? What is the use-case? Won’t MDStrings continue to be uniqued? Overall, I’m thrilled to see this direction, thanks for pushing forward on it Duncan! -Chris
Duncan P. N. Exon Smith
2014-Nov-10  17:08 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On 2014-Nov-09, at 22:24, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com> >> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> >> Sent: Sunday, November 9, 2014 7:02:43 PM >> Subject: [LLVMdev] [RFC] Separating Metadata from the Value hierarchy >> >> >> >> TL;DR: If you use metadata (especially if it's out-of-tree), check >> the >> numbered list of lost functionality below to see whether I'm trying >> to >> break your compiler permanently. >> >> In response to my recent commits (e.g., [1]) that changed API from >> `MDNode` to `Value`, Eric had a really interesting idea [2] -- split >> metadata entirely from the `Value` hierarchy, and drop general >> support >> for metadata RAUW. >> >> After hacking together some example code, this seems overwhelmingly >> to >> me like the right direction. See the attached metadata-v2.patch for >> my >> sketch of what the current metadata primitives might look like in the >> new hierarchy (includes LLVMContextImpl uniquing support). >> >> The initial motivation was to avoid the API weaknesses caused by >> having >> non-`MDNode` metadata that inherits from `Value`. In particular, >> instead of changing API from `MDNode` to `Value`, change it to a base >> class called `Metadata`, which sheds the underused and expensive >> `Value` >> base class entirely. >> >> The implications are broader: an enormous reduction in complexity >> (and >> overhead) for metadata. >> >> This change implies minor (major?) loss of functionality for >> metadata, >> but Eric and I suspect that the only hard-to-fix user is debug info >> itself, whose IR infrastructure I'm rewriting anyway. >> >> Here is what we'd lose: >> >> 1. No more arbitrary RAUW of metadata. >> >> While we'd keep support for RAUW of temporary MDNodes for use as >> forward declarations (when parsing assembly or constructing cyclic >> graphs), drop RAUW support for all other metadata. > > So temporary MDNodes would be Values, but regular metadata would not be? Will regular metadata nodes no longer have lists of users? If I have a TrackingVH<MDNode> with temporary MDNodes, after I call replaceAllUsesWith, what happens? > > I'm specifically wondering how we'll need to update CloneAliasScopeMetadata in lib/Transforms/Utils/InlineFunction.cpp, and I don't see any fundamental problem with what you've proposed, but these seem like generic upgrade questions. > > Thanks again, > HalSo, none of them would be `Value`s. `TempMDNode` supports its own, non-`Value`-based, RAUW via `ReplaceableMetadataImpl` (`ValueAsMetadata` uses the same). `CloneAliasScopeMetadata()` should use `TrackingMDRef` in place of `TrackingVH<MDNode>`. `TrackingMDRef` will register itself with the `Metadata` if it's replaceable, and if/when it gets RAUW'ed, the pointer will get updated. If you have another look at the patch it might be more clear now. BTW, another thing I added in that sketch was the option to explicitly request a non-uniqued `MDNode`. Haven't thought through details, but I was specifically thinking this would be a cleaner way to create your non-uniquable alias nodes (instead of the current self-referential approach). My working straw man for assembly syntax looks like this: !1 = metadata distinct !{ ... } As an example, the alias "root" nodes could change from this: !1 = metadata !{ metadata !1 } to this: !1 = metadata distinct !{} Which means constructing them would change from this: MDNode *T = MDNode::getTemporary(Context, None) MDNode *N = MDNode::get(Context, {T}); T->replaceAllUsesWith(N); MDNode::deleteTemporary(T); to this: // In that patch it's "getNonUniqued()". MDNode *N = MDNode::getDistinct(Context, None); Furthermore, if all references to the alias root got dropped, they'd clean themselves up instead of keeping each other alive in a cycle.
Duncan P. N. Exon Smith
2014-Nov-10  18:17 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On 2014-Nov-10, at 08:30, Chris Lattner <clattner at apple.com> wrote: > >> On Nov 9, 2014, at 5:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >> In response to my recent commits (e.g., [1]) that changed API from >> `MDNode` to `Value`, Eric had a really interesting idea [2] -- split >> metadata entirely from the `Value` hierarchy, and drop general support >> for metadata RAUW. > > Wow, this never occurred to me, but in hindsight seems like obviously the right direction.Yup, good on Eric here.>> Here is what we'd lose: >> >> 1. No more arbitrary RAUW of metadata. >> >> While we'd keep support for RAUW of temporary MDNodes for use as >> forward declarations (when parsing assembly or constructing cyclic >> graphs), drop RAUW support for all other metadata. > > This seems fine to me, a corollary of this should be that MDNodes never “move around” in memory due to late uniquing etc, which means that TrackingVH shouldn’t be necessary, right? This should make all frontends a lot more memory efficient because they can just use raw pointers to MDNodes everywhere.Almost. Two caveats: 1. Handles should generally be `MDRef` instead of `Metadata*`, since I threw in reference-counting semantics so that no-longer-referenced metadata cleans itself up. 2. If a handle might point to a forward reference -- i.e., a `TempMDNode` in this patch -- it should use `TrackingMDRef`. When the forward reference gets resolved, it updates all of its tracking references. Nevertheless, `sizeof(MDRef) == sizeof(TrackingMDRef) == sizeof(void*)`. Frontends *only* pay memory overhead for the currently-unresolved forward references. (Maybe `MDNodeFwdRef` is a better name than `TempMDNode`?)> >> 2. No more function-local metadata. > > This was a great idea that never mattered, I’m happy to drop it for the greater good :-) >Awesome.> +class Metadata { > +private: > + LLVMContext &Context; > > Out of curiosity, why do Metadata nodes all need to carry around a context? This seems like memory bloat that would be great to avoid if possible.There are two uses for the context: - RAUW. Metadata that can RAUW (`TempMDNode` and `MetadataAsValue`) need a context. Other metadata need access to a context when their operands get RAUW'ed (so they can re-unique themselves), but this could be passed in as an argument, so they don't need their own. - Reference counting. My sketch uses reference counting for all the nodes, so they all need a context to delete themselves when their last reference gets dropped. Customized ownership of debug info metadata will allow us to get the context from parent pointers, so we might be able to remove this down the road (depending on whether the `Metadata` subclass is uniqued).> +template <class T> class TypedMDRef : public MDRef { > > Is this a safe way to model this, should it use private inheritance instead? Covariance seems like it would break this: specifically something could pass off a typed MDRef as an MDRef&, then the client could reassign something of the wrong type into it.Good point.> + MDString(LLVMContext *Context) : Metadata(*Context, MDStringKind) { > + assert(Context && "Expected non-null context"); > ... > + static MDStringRef get(LLVMContext &Context, StringRef String); > > You have reference/pointer disagreement, I’d recommend taking LLVMContext& to the ctor for uniformity (and remove the assert).This is a workaround for `StringMapEntry`'s imperfect forwarding -- it passes the `InitVal` constructor argument by value. I'll fix the problem at its source and clean this up.> +class ReplaceableMetadataImpl { > > I’m not following your algorithm intentions well yet, but this seems like a really heavy-weight implementation, given that this is a common base class for a number of other important things.Unfortunately I think this weight will be hard to optimize out (although I'm sure it could be a little smaller). In the `Value` hierarchy, you pay memory for RAUW at the site of each `Use`. This makes sense, since most values can be RAUW'ed. Since very little metadata needs to be RAUW'ed, we don't want to pay memory at every use-site. Instead, we pay the cost inside the (hopefully) few instances that support RAUW. The core assumption is that there are relatively few replaceable metadata instances. The only subclasses are `ValueAsMetadata` and `TempMDNode`. How many of these are live? - There will be at most one `ValueAsMetadata` instance for each metadata operand that points at a `Value`. My data from a couple of months ago showed that there are very few of these. - `TempMDNodes` are used as forward references. You only pay their cost until the forward reference gets resolved (i.e., deleted). The main case I'm worried about here are references to `ConstantInt`s, which are used pretty heavily outside of debug info (e.g., in `!prof` attachments). However, if that shows up in a profile, we can bypass `Value`s entirely by creating a new `MDIntArray` subclass (or something).> +class ValueAsMetadata : public Metadata, ReplaceableMetadataImpl { > > I think that ValueAsMetadata is a great thing to have - it is essential to refer to global variables etc. That said, I think that it will eventually be worthwhile to have metadata versions of integers and other common things to reduce memory. This should come in a later pass of course.Yup, agree entirely.> > I don’t follow the point of > +class UniquableMDNode : public MDNode { > > What would subclass it? What is the use-case? Won’t MDStrings continue to be uniqued?I think you've just been misled by my terrible name. This sketch splits the "temporary" concept for `MDNode` into a separate subclass, so that non-forward-reference `MDNode`s don't have to pay for RAUW overhead. The class hierarchy I envision looks something like this: Metadata MDNode TempMDNode // MDNodeFwdRef? UniquableMDNode // GenericMDNode? DINode // Is this layer useful? DILocation DIScope DIType DIBasicType DICompositeType ... DISubprogram ... DICompileUnit ... MDString ValueAsMetadata `UniquableMDNode` is a leaf-class that behaves like the current `MDNode` (when it's not a temporary). I called it "uniquable" because, unlike `TempMDNode`, it is uniqued by default (although you can opt-out of it, and uniquing might be dropped). Maybe a better name is `GenericMDNode`? Also, off-topic, but after sketching out the imagined hierarchy above, I'm not sure `DINode` is particularly useful.
Philip Reames
2014-Nov-10  22:13 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
On 11/09/2014 05:02 PM, Duncan P. N. Exon Smith wrote:> TL;DR: If you use metadata (especially if it's out-of-tree), check the > numbered list of lost functionality below to see whether I'm trying to > break your compiler permanently. > > In response to my recent commits (e.g., [1]) that changed API from > `MDNode` to `Value`, Eric had a really interesting idea [2] -- split > metadata entirely from the `Value` hierarchy, and drop general support > for metadata RAUW. > > After hacking together some example code, this seems overwhelmingly to > me like the right direction. See the attached metadata-v2.patch for my > sketch of what the current metadata primitives might look like in the > new hierarchy (includes LLVMContextImpl uniquing support). > > The initial motivation was to avoid the API weaknesses caused by having > non-`MDNode` metadata that inherits from `Value`. In particular, > instead of changing API from `MDNode` to `Value`, change it to a base > class called `Metadata`, which sheds the underused and expensive `Value` > base class entirely. > > The implications are broader: an enormous reduction in complexity (and > overhead) for metadata. > > This change implies minor (major?) loss of functionality for metadata, > but Eric and I suspect that the only hard-to-fix user is debug info > itself, whose IR infrastructure I'm rewriting anyway.I generally support this direction. I do not know of any current use case which would be broken by this change. I do have some hesitations about potential future use though. (see below)> > Here is what we'd lose: > > 1. No more arbitrary RAUW of metadata. > > While we'd keep support for RAUW of temporary MDNodes for use as > forward declarations (when parsing assembly or constructing cyclic > graphs), drop RAUW support for all other metadata. > > Note that we'd also keep support for RAUW of `Value` operands of > metadata. > > If the RAUW of an operand causes a uniquing collision, uniquing > would be dropped for that node. This matches the current behaviour > when an operand goes to null. > > Upgrade path: none. > > 2. No more function-local metadata. > > AFAICT, function-local metadata is *only* used for indirect > references to instructions and arguments in `@llvm.dbg.value` and > `@llvm.dbg.declare` intrinsics. The first argument of the following > is an example: > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > metadata !1)I hesitate a bit here. The current range metadata deals only with constant ranges, but I could see a day where encoding a range in terms of two other Values might be useful. This is one obvious use case, but there may also be others. I'm not going to oppose the proposal made, but if there was a way to preserve the ability to reference to function local SSA values, I'd slightly prefer that.> Note that the debug info people uniformly seem to dislike the status > quo, since it's awkward to get from a `Value` to the corresponding > intrinsic. > > Upgrade path: Instead of using an intrinsic that references a > function-local value through an `MDNode`, attach metadata to the > corresponding argument or instruction, or to the terminating > instruction of the basic block. (This requires new support for > attaching metadata to function arguments, which I'll have to add for > debug info anyway.) > > Is this going to break your compiler? How? Why is your use case worth > supporting? > > [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html > "r221167 - IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()" > [2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html > "Re: First-class debug info IR: MDLocation" > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141110/b44da2d5/attachment.html>
Duncan P. N. Exon Smith
2014-Nov-11  00:08 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On 2014-Nov-10, at 14:13, Philip Reames <listmail at philipreames.com> wrote: > >> 2. No more function-local metadata. >> >> AFAICT, function-local metadata is *only* used for indirect >> references to instructions and arguments in `@llvm.dbg.value` and >> `@llvm.dbg.declare` intrinsics. The first argument of the following >> is an example: >> >> call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, >> metadata !1) >> > I hesitate a bit here. The current range metadata deals only with constant ranges, but I could see a day where encoding a range in terms of two other Values might be useful. This is one obvious use case, but there may also be others. > > I'm not going to oppose the proposal made, but if there was a way to preserve the ability to reference to function local SSA values, I'd slightly prefer that.In theory it's possible, but it adds a large complexity burden because of the need to track whether metadata is local or global. I strongly prefer dropping it until a compelling use-case appears. Note that debug info is rather special in that it's *not allowed* to modify optimizations or code generation. For non-debug info, you can reference local values directly using intrinsics, as (e.g.) @llvm.assume [1] does. [1]: http://llvm.org/docs/LangRef.html#llvm-assume-intrinsic
Duncan P. N. Exon Smith
2014-Nov-12  21:00 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
If you don't care about function-local metadata and debug info intrinsics, skip ahead to the section on assembly syntax in case you have comments on that.> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > 2. No more function-local metadata. > > AFAICT, function-local metadata is *only* used for indirect > references to instructions and arguments in `@llvm.dbg.value` and > `@llvm.dbg.declare` intrinsics. The first argument of the following > is an example: > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > metadata !1) > > Note that the debug info people uniformly seem to dislike the status > quo, since it's awkward to get from a `Value` to the corresponding > intrinsic. > > Upgrade path: Instead of using an intrinsic that references a > function-local value through an `MDNode`, attach metadata to the > corresponding argument or instruction, or to the terminating > instruction of the basic block. (This requires new support for > attaching metadata to function arguments, which I'll have to add for > debug info anyway.)llvm::Argument attachments are hard ================================== I've been looking at prototyping metadata attachments to `llvm::Argument`, which is key to replacing debug info intrinsics. It's a fairly big piece of new IR, and comes with its own subtle semantic decisions. What do you do with metadata attached to arguments when you inline a function? If the arguments are remapped to other instructions (or arguments), they may have metadata of the same kind attached. Does one win? Which one? Or are they merged? What if the arguments get remapped to constants? What about when a function is cloned? While the rest of this metadata-is-not-a-value proposal is effectively NFC, this `Argument` part could introduce problems. If I rewrite debug info intrinsics as argument attachments and then immediately split `Metadata` from `Value`, any semantic subtleties will be difficult to diagnose in the noise of the rest of the changes. While I was looking at this as a shortcut to avoid porting function-local metadata, I think it introduces more points of failure than problems it solves. Limited function-local metadata ------------------------------- Instead, I propose porting a limited form of function-local metadata, whose use is severely restricted but covers our current use cases (keep reading for details). We can defer replacing debug info intrinsics until the infrastructure has settled down and is stable. Assembly syntax ============== This is a good time to talk about assembly syntax, since it will demonstrate what I'm thinking for function-local metadata. Assembly syntax is important. It's our view into the IR. If metadata is typeless (and not a `Value`), that should be reflected in the assembly syntax. Old syntax ---------- There are four places metadata can be used/reference in the IR. 1. Operands of `MDNode`. !0 = metadata !{metadata !"string", metadata !1, i32* @global) Notice that the `@global` argument is not metadata: it's an `llvm::Constant`. In the new IR, these will be wrapped in a `ValueAsMetadata` instance. 2. Operands of `NamedMDNode` (yes, they're different). !named = metadata !{metadata !0, metadata !1} These operands are always `MDNode`. 3. Attachments to instructions. %inst = load i32* @global, !dbg !0 Notice that we already skip the `metadata` type here. 4. Arguments to intrinsics. call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) The first argument is subtle -- that's a function-local `MDNode` with `%inst` as its only operand. In the new IR, the second operand will be a `MetadataAsValue` instance that contains a reference to the `MDNode` numbered `!0`. New syntax ---------- Types only make sense when an operand can be an `llvm::Value`. Let's remove them where they don't make sense. I propose the following syntax for the above examples, using a new keyword, `value`: 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have types. Use `value` to indicate a wrapped `llvm::Value`. !0 = !{!"string", !1, value i32* @global) 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't have types. !named = !{!0, !1} 3. Attachments to instructions. No change! %inst = load i32* @global, !dbg !0 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped into an `llvm::Value` (which has a type). Use `value` to indicate a metadata-wrapped value. call void @llvm.dbg(metadata value i32 %inst, metadata !0) Notice that the first argument doesn't use an `MDNode` anymore. Restrictions on function-local metadata ====================================== In the new IR, function-local metadata (say, `LocalValueAsMetadata`) *cannot* be used as an operand to metadata -- the only legal place for it is in a `MetadataAsValue` instance. This prevents the additional complexity from poisoning the rest of the metadata hierarchy. Effectively, this restricts function-local metadata to direct operands of intrinsics.
Adrian Prantl
2014-Nov-13  19:17 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On Nov 12, 2014, at 1:00 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > If you don't care about function-local metadata and debug info > intrinsics, skip ahead to the section on assembly syntax in case you > have comments on that. > >> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >> >> 2. No more function-local metadata. >> >> AFAICT, function-local metadata is *only* used for indirect >> references to instructions and arguments in `@llvm.dbg.value` and >> `@llvm.dbg.declare` intrinsics. The first argument of the following >> is an example: >> >> call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, >> metadata !1) >> >> Note that the debug info people uniformly seem to dislike the status >> quo, since it's awkward to get from a `Value` to the corresponding >> intrinsic. >> >> Upgrade path: Instead of using an intrinsic that references a >> function-local value through an `MDNode`, attach metadata to the >> corresponding argument or instruction, or to the terminating >> instruction of the basic block. (This requires new support for >> attaching metadata to function arguments, which I'll have to add for >> debug info anyway.) > > llvm::Argument attachments are hard > ==================================> > I've been looking at prototyping metadata attachments to > `llvm::Argument`, which is key to replacing debug info intrinsics. > > It's a fairly big piece of new IR, and comes with its own subtle > semantic decisions. What do you do with metadata attached to arguments > when you inline a function? If the arguments are remapped to other > instructions (or arguments), they may have metadata of the same kind > attached. Does one win? Which one? Or are they merged? What if the > arguments get remapped to constants? What about when a function is > cloned? > > While the rest of this metadata-is-not-a-value proposal is effectively > NFC, this `Argument` part could introduce problems. If I rewrite debug > info intrinsics as argument attachments and then immediately split > `Metadata` from `Value`, any semantic subtleties will be difficult to > diagnose in the noise of the rest of the changes. > > While I was looking at this as a shortcut to avoid porting > function-local metadata, I think it introduces more points of failure > than problems it solves.One thing to consider is that there are cases were we describe function arguments without referencing the argument in the intrinsic at all. Currently, if you compile $ cat struct.c struct s { int a; int b; }; int foo(struct s s1) { return s1.a; } $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c we cannot preserve the debug info for the argument so it is turned into an intrinsic describing an undef value. ; Function Attrs: nounwind readnone ssp uwtable define i32 @foo(i64 %s1.coerce) #0 { entry: %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32 tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata !19), !dbg !20 ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21 } !18 = metadata !{%struct.s* undef} Note that it is critical for this DIVariable to make it into the debug info even if it is undefined, or the function arguments won’t match the function signature. -- adrian> > Limited function-local metadata > ------------------------------- > > Instead, I propose porting a limited form of function-local metadata, > whose use is severely restricted but covers our current use cases (keep > reading for details). We can defer replacing debug info intrinsics > until the infrastructure has settled down and is stable. > > Assembly syntax > ==============> > This is a good time to talk about assembly syntax, since it will > demonstrate what I'm thinking for function-local metadata. > > Assembly syntax is important. It's our view into the IR. If metadata > is typeless (and not a `Value`), that should be reflected in the > assembly syntax. > > Old syntax > ---------- > > There are four places metadata can be used/reference in the IR. > > 1. Operands of `MDNode`. > > !0 = metadata !{metadata !"string", metadata !1, i32* @global) > > Notice that the `@global` argument is not metadata: it's an > `llvm::Constant`. In the new IR, these will be wrapped in a > `ValueAsMetadata` instance. > > 2. Operands of `NamedMDNode` (yes, they're different). > > !named = metadata !{metadata !0, metadata !1} > > These operands are always `MDNode`. > > 3. Attachments to instructions. > > %inst = load i32* @global, !dbg !0 > > Notice that we already skip the `metadata` type here. > > 4. Arguments to intrinsics. > > call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) > > The first argument is subtle -- that's a function-local `MDNode` > with `%inst` as its only operand. > > In the new IR, the second operand will be a `MetadataAsValue` > instance that contains a reference to the `MDNode` numbered `!0`. > > New syntax > ---------- > > Types only make sense when an operand can be an `llvm::Value`. Let's > remove them where they don't make sense. > > I propose the following syntax for the above examples, using a new > keyword, `value`: > > 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have > types. Use `value` to indicate a wrapped `llvm::Value`. > > !0 = !{!"string", !1, value i32* @global) > > 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't > have types. > > !named = !{!0, !1} > > 3. Attachments to instructions. No change! > > %inst = load i32* @global, !dbg !0 > > 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped > into an `llvm::Value` (which has a type). Use `value` to indicate a > metadata-wrapped value. > > call void @llvm.dbg(metadata value i32 %inst, metadata !0) > > Notice that the first argument doesn't use an `MDNode` anymore. > > Restrictions on function-local metadata > ======================================> > In the new IR, function-local metadata (say, `LocalValueAsMetadata`) > *cannot* be used as an operand to metadata -- the only legal place for > it is in a `MetadataAsValue` instance. This prevents the additional > complexity from poisoning the rest of the metadata hierarchy. > > Effectively, this restricts function-local metadata to direct operands > of intrinsics. >
Philip Reames
2014-Nov-13  21:34 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
On 11/12/2014 01:00 PM, Duncan P. N. Exon Smith wrote:> If you don't care about function-local metadata and debug info > intrinsics, skip ahead to the section on assembly syntax in case you > have comments on that. > >> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >> >> 2. No more function-local metadata. >> >> AFAICT, function-local metadata is *only* used for indirect >> references to instructions and arguments in `@llvm.dbg.value` and >> `@llvm.dbg.declare` intrinsics. The first argument of the following >> is an example: >> >> call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, >> metadata !1) >> >> Note that the debug info people uniformly seem to dislike the status >> quo, since it's awkward to get from a `Value` to the corresponding >> intrinsic. >> >> Upgrade path: Instead of using an intrinsic that references a >> function-local value through an `MDNode`, attach metadata to the >> corresponding argument or instruction, or to the terminating >> instruction of the basic block. (This requires new support for >> attaching metadata to function arguments, which I'll have to add for >> debug info anyway.) > llvm::Argument attachments are hard > ==================================> > I've been looking at prototyping metadata attachments to > `llvm::Argument`, which is key to replacing debug info intrinsics. > > It's a fairly big piece of new IR, and comes with its own subtle > semantic decisions. What do you do with metadata attached to arguments > when you inline a function? If the arguments are remapped to other > instructions (or arguments), they may have metadata of the same kind > attached. Does one win? Which one? Or are they merged? What if the > arguments get remapped to constants? What about when a function is > cloned? > > While the rest of this metadata-is-not-a-value proposal is effectively > NFC, this `Argument` part could introduce problems. If I rewrite debug > info intrinsics as argument attachments and then immediately split > `Metadata` from `Value`, any semantic subtleties will be difficult to > diagnose in the noise of the rest of the changes. > > While I was looking at this as a shortcut to avoid porting > function-local metadata, I think it introduces more points of failure > than problems it solves. > > Limited function-local metadata > ------------------------------- > > Instead, I propose porting a limited form of function-local metadata, > whose use is severely restricted but covers our current use cases (keep > reading for details). We can defer replacing debug info intrinsics > until the infrastructure has settled down and is stable.This seems entirely reasonable. Long term, supporting metadata on arguments would be useful, but we should also have a broader discussion about the role of attributes and metadata before we do that.> > Assembly syntax > ==============> > This is a good time to talk about assembly syntax, since it will > demonstrate what I'm thinking for function-local metadata. > > Assembly syntax is important. It's our view into the IR. If metadata > is typeless (and not a `Value`), that should be reflected in the > assembly syntax. > > Old syntax > ---------- > > There are four places metadata can be used/reference in the IR. > > 1. Operands of `MDNode`. > > !0 = metadata !{metadata !"string", metadata !1, i32* @global) > > Notice that the `@global` argument is not metadata: it's an > `llvm::Constant`. In the new IR, these will be wrapped in a > `ValueAsMetadata` instance. > > 2. Operands of `NamedMDNode` (yes, they're different). > > !named = metadata !{metadata !0, metadata !1} > > These operands are always `MDNode`. > > 3. Attachments to instructions. > > %inst = load i32* @global, !dbg !0 > > Notice that we already skip the `metadata` type here. > > 4. Arguments to intrinsics. > > call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) > > The first argument is subtle -- that's a function-local `MDNode` > with `%inst` as its only operand. > > In the new IR, the second operand will be a `MetadataAsValue` > instance that contains a reference to the `MDNode` numbered `!0`. > > New syntax > ---------- > > Types only make sense when an operand can be an `llvm::Value`. Let's > remove them where they don't make sense.Hm, how does this interact with range metadata? Currently, the type of the values making up the range have to match the instruction they're attached to. This seems like it could be a change in behaviour. Thinking about it, it doesn't seem like a bad change, but it is a change. Are there other cases like this?> > I propose the following syntax for the above examples, using a new > keyword, `value`: > > 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have > types. Use `value` to indicate a wrapped `llvm::Value`. > > !0 = !{!"string", !1, value i32* @global) > > 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't > have types. > > !named = !{!0, !1} > > 3. Attachments to instructions. No change! > > %inst = load i32* @global, !dbg !0 > > 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped > into an `llvm::Value` (which has a type). Use `value` to indicate a > metadata-wrapped value. > > call void @llvm.dbg(metadata value i32 %inst, metadata !0) > > Notice that the first argument doesn't use an `MDNode` anymore. > > Restrictions on function-local metadata > ======================================> > In the new IR, function-local metadata (say, `LocalValueAsMetadata`) > *cannot* be used as an operand to metadata -- the only legal place for > it is in a `MetadataAsValue` instance. This prevents the additional > complexity from poisoning the rest of the metadata hierarchy. > > Effectively, this restricts function-local metadata to direct operands > of intrinsics.This seems entirely reasonable. Philip
Eric Christopher
2014-Nov-14  22:11 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
Seems reasonable. I'd love to get rid of the intrinsics but... yeah. Thanks Duncan. -eric On Wed Nov 12 2014 at 1:00:21 PM Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> If you don't care about function-local metadata and debug info > intrinsics, skip ahead to the section on assembly syntax in case you > have comments on that. > > > On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> > wrote: > > > > 2. No more function-local metadata. > > > > AFAICT, function-local metadata is *only* used for indirect > > references to instructions and arguments in `@llvm.dbg.value` and > > `@llvm.dbg.declare` intrinsics. The first argument of the following > > is an example: > > > > call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > > metadata !1) > > > > Note that the debug info people uniformly seem to dislike the status > > quo, since it's awkward to get from a `Value` to the corresponding > > intrinsic. > > > > Upgrade path: Instead of using an intrinsic that references a > > function-local value through an `MDNode`, attach metadata to the > > corresponding argument or instruction, or to the terminating > > instruction of the basic block. (This requires new support for > > attaching metadata to function arguments, which I'll have to add for > > debug info anyway.) > > llvm::Argument attachments are hard > ==================================> > I've been looking at prototyping metadata attachments to > `llvm::Argument`, which is key to replacing debug info intrinsics. > > It's a fairly big piece of new IR, and comes with its own subtle > semantic decisions. What do you do with metadata attached to arguments > when you inline a function? If the arguments are remapped to other > instructions (or arguments), they may have metadata of the same kind > attached. Does one win? Which one? Or are they merged? What if the > arguments get remapped to constants? What about when a function is > cloned? > > While the rest of this metadata-is-not-a-value proposal is effectively > NFC, this `Argument` part could introduce problems. If I rewrite debug > info intrinsics as argument attachments and then immediately split > `Metadata` from `Value`, any semantic subtleties will be difficult to > diagnose in the noise of the rest of the changes. > > While I was looking at this as a shortcut to avoid porting > function-local metadata, I think it introduces more points of failure > than problems it solves. > > Limited function-local metadata > ------------------------------- > > Instead, I propose porting a limited form of function-local metadata, > whose use is severely restricted but covers our current use cases (keep > reading for details). We can defer replacing debug info intrinsics > until the infrastructure has settled down and is stable. > > Assembly syntax > ==============> > This is a good time to talk about assembly syntax, since it will > demonstrate what I'm thinking for function-local metadata. > > Assembly syntax is important. It's our view into the IR. If metadata > is typeless (and not a `Value`), that should be reflected in the > assembly syntax. > > Old syntax > ---------- > > There are four places metadata can be used/reference in the IR. > > 1. Operands of `MDNode`. > > !0 = metadata !{metadata !"string", metadata !1, i32* @global) > > Notice that the `@global` argument is not metadata: it's an > `llvm::Constant`. In the new IR, these will be wrapped in a > `ValueAsMetadata` instance. > > 2. Operands of `NamedMDNode` (yes, they're different). > > !named = metadata !{metadata !0, metadata !1} > > These operands are always `MDNode`. > > 3. Attachments to instructions. > > %inst = load i32* @global, !dbg !0 > > Notice that we already skip the `metadata` type here. > > 4. Arguments to intrinsics. > > call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) > > The first argument is subtle -- that's a function-local `MDNode` > with `%inst` as its only operand. > > In the new IR, the second operand will be a `MetadataAsValue` > instance that contains a reference to the `MDNode` numbered `!0`. > > New syntax > ---------- > > Types only make sense when an operand can be an `llvm::Value`. Let's > remove them where they don't make sense. > > I propose the following syntax for the above examples, using a new > keyword, `value`: > > 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have > types. Use `value` to indicate a wrapped `llvm::Value`. > > !0 = !{!"string", !1, value i32* @global) > > 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't > have types. > > !named = !{!0, !1} > > 3. Attachments to instructions. No change! > > %inst = load i32* @global, !dbg !0 > > 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped > into an `llvm::Value` (which has a type). Use `value` to indicate a > metadata-wrapped value. > > call void @llvm.dbg(metadata value i32 %inst, metadata !0) > > Notice that the first argument doesn't use an `MDNode` anymore. > > Restrictions on function-local metadata > ======================================> > In the new IR, function-local metadata (say, `LocalValueAsMetadata`) > *cannot* be used as an operand to metadata -- the only legal place for > it is in a `MetadataAsValue` instance. This prevents the additional > complexity from poisoning the rest of the metadata hierarchy. > > Effectively, this restricts function-local metadata to direct operands > of intrinsics. > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141114/9513a766/attachment.html>