thr3ads.net - llvm dev - [LLVMdev] [RFC] Separating Metadata from the Value hierarchy [Nov 2014]


Duncan P. N. Exon Smith

2014-Nov-10 01:02 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

TL;DR: If you use metadata (especially if it's out-of-tree), check the
numbered list of lost functionality below to see whether I'm trying to
break your compiler permanently.

In response to my recent commits (e.g., [1]) that changed API from
`MDNode` to `Value`, Eric had a really interesting idea [2] -- split
metadata entirely from the `Value` hierarchy, and drop general support
for metadata RAUW.

After hacking together some example code, this seems overwhelmingly to
me like the right direction.  See the attached metadata-v2.patch for my
sketch of what the current metadata primitives might look like in the
new hierarchy (includes LLVMContextImpl uniquing support).

The initial motivation was to avoid the API weaknesses caused by having
non-`MDNode` metadata that inherits from `Value`.  In particular,
instead of changing API from `MDNode` to `Value`, change it to a base
class called `Metadata`, which sheds the underused and expensive `Value`
base class entirely.

The implications are broader: an enormous reduction in complexity (and
overhead) for metadata.

This change implies minor (major?) loss of functionality for metadata,
but Eric and I suspect that the only hard-to-fix user is debug info
itself, whose IR infrastructure I'm rewriting anyway.

Here is what we'd lose:

 1. No more arbitrary RAUW of metadata.

    While we'd keep support for RAUW of temporary MDNodes for use as
    forward declarations (when parsing assembly or constructing cyclic
    graphs), drop RAUW support for all other metadata.

    Note that we'd also keep support for RAUW of `Value` operands of
    metadata.

    If the RAUW of an operand causes a uniquing collision, uniquing
    would be dropped for that node.  This matches the current behaviour
    when an operand goes to null.

    Upgrade path: none.

 2. No more function-local metadata.

    AFAICT, function-local metadata is *only* used for indirect
    references to instructions and arguments in `@llvm.dbg.value` and
    `@llvm.dbg.declare` intrinsics.  The first argument of the following
    is an example:

        call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
                                  metadata !1)

    Note that the debug info people uniformly seem to dislike the status
    quo, since it's awkward to get from a `Value` to the corresponding
    intrinsic.

    Upgrade path: Instead of using an intrinsic that references a
    function-local value through an `MDNode`, attach metadata to the
    corresponding argument or instruction, or to the terminating
    instruction of the basic block.  (This requires new support for
    attaching metadata to function arguments, which I'll have to add for
    debug info anyway.)
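
To make the uniquing-collision rule in item 1 concrete, here is a minimal, self-contained sketch. It is not code from metadata-v2.patch: `ToyNode`, `ToyContext`, and the integer `Operand` stand-in are hypothetical names, and the real uniquing tables in `LLVMContextImpl` may well be organized differently. It only shows the rule itself: when rewriting an operand makes a node identical to an already-uniqued node, the existing node keeps its table entry and the rewritten node stops being uniqued.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for an operand; a real node would point at a
// Value or at other metadata.
using Operand = int;

struct ToyNode {
  std::vector<Operand> Ops;
  bool IsUniqued = true;
};

// Hypothetical uniquing table, loosely playing the role of the context.
class ToyContext {
  std::map<std::vector<Operand>, ToyNode *> Uniqued;
  std::vector<std::unique_ptr<ToyNode>> Storage;

public:
  // Return the uniqued node for Ops, creating it on first request.
  ToyNode *get(const std::vector<Operand> &Ops) {
    auto It = Uniqued.find(Ops);
    if (It != Uniqued.end())
      return It->second;
    Storage.push_back(std::make_unique<ToyNode>());
    ToyNode *N = Storage.back().get();
    N->Ops = Ops;
    Uniqued[Ops] = N;
    return N;
  }

  // Rewrite every occurrence of From in N's operands to To.
  void replaceOperand(ToyNode *N, Operand From, Operand To) {
    // A uniqued node is keyed by its operands, so drop the stale key first.
    if (N->IsUniqued)
      Uniqued.erase(N->Ops);
    for (Operand &Op : N->Ops)
      if (Op == From)
        Op = To;
    if (!N->IsUniqued)
      return;
    // On a collision the existing entry wins and N stops being uniqued.
    bool Inserted = Uniqued.emplace(N->Ops, N).second;
    if (!Inserted)
      N->IsUniqued = false;
  }
};
```

In this toy model, a later `get()` with the colliding operand list still returns the original node, and the rewritten node simply lives on as a non-uniqued one.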

Is this going to break your compiler?  How?  Why is your use case worth
supporting?

[1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html
    "r221167 - IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()"
[2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html
    "Re: First-class debug info IR: MDLocation"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: metadata-v2.patch
Type: application/octet-stream
Size: 31022 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141109/4d4fffcf/attachment.obj>

Chandler Carruth

2014-Nov-10 01:49 UTC


FWIW, this completely addresses my only ill feelings about the changes you
were making. I really like it.

I might bikeshed some of the names, but whatever. I would suggest maybe
augmenting some of the doxygen comments to help show the specific use case
that motivates the node? You already have this in a few places and it
really helped me skimming the code. In particular, there are a bunch of
very similar nodes around the tracking, temp, uniquable, etc. and I think
it'll be important to clearly distinguish each use case that these are
designed to address.

Also just want to say thanks for diving into the design and finding
something (at least, I'm hoping!) works even better.


Bob Wilson

2014-Nov-10 05:55 UTC


> On Nov 9, 2014, at 5:49 PM, Chandler Carruth <chandlerc at google.com> wrote:
>
> FWIW, this completely addresses my only ill feelings about the changes you
> were making. I really like it.

I really like it, too.

Eric, thanks for the great suggestion!

Hal Finkel

2014-Nov-10 06:24 UTC


----- Original Message -----
> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Sunday, November 9, 2014 7:02:43 PM
> Subject: [LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> 
> 
> 
> Here is what we'd lose:
> 
> 1. No more arbitrary RAUW of metadata.
> 
> While we'd keep support for RAUW of temporary MDNodes for use as
> forward declarations (when parsing assembly or constructing cyclic
> graphs), drop RAUW support for all other metadata.

So temporary MDNodes would be Values, but regular metadata would not be? Will
regular metadata nodes no longer have lists of users? If I have a
TrackingVH<MDNode> with temporary MDNodes, after I call
replaceAllUsesWith, what happens?

I'm specifically wondering how we'll need to update
CloneAliasScopeMetadata in lib/Transforms/Utils/InlineFunction.cpp, and I
don't see any fundamental problem with what you've proposed, but these
seem like generic upgrade questions.

Thanks again,
Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Chris Lattner

2014-Nov-10 16:30 UTC


> On Nov 9, 2014, at 5:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> In response to my recent commits (e.g., [1]) that changed API from
> `MDNode` to `Value`, Eric had a really interesting idea [2] -- split
> metadata entirely from the `Value` hierarchy, and drop general support
> for metadata RAUW.

Wow, this never occurred to me, but in hindsight it seems like obviously the
right direction.

> Here is what we'd lose:
>
> 1. No more arbitrary RAUW of metadata.
>
>    While we'd keep support for RAUW of temporary MDNodes for use as
>    forward declarations (when parsing assembly or constructing cyclic
>    graphs), drop RAUW support for all other metadata.

This seems fine to me. A corollary of this should be that MDNodes never “move
around” in memory due to late uniquing etc., which means that TrackingVH
shouldn’t be necessary, right?  This should make all frontends a lot more memory
efficient, because they can just use raw pointers to MDNodes everywhere.

> 2. No more function-local metadata.

This was a great idea that never mattered; I’m happy to drop it for the greater
good :-)

Some comments on the patch-in-progress:

+++ b/include/llvm/IR/MetadataV2.h
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/PointerUnion.h"
+#include "llvm/ADT/StringMap.h"
+#include "llvm/IR/Value.h"
+#include "llvm/Support/ErrorHandling.h"

Please factor things to avoid including heavy headers like DenseMap and
StringMap (e.g. by moving the Impl stuff out).

+class Metadata {
+private:
+  LLVMContext &Context;

Out of curiosity, why do Metadata nodes all need to carry around a context? 
This seems like memory bloat that would be great to avoid if possible.

+template <class T> class TypedMDRef : public MDRef {

Is this a safe way to model this, or should it use private inheritance instead?
Covariance seems like it would break this: something could pass off a typed
MDRef as an MDRef&, and then the client could reassign something of the wrong
type into it.
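
The covariance concern can be reproduced with a small standalone sketch. None of these classes come from the patch: `ToyRef`, `UnsafeTypedRef`, `SafeTypedRef`, and `clobber` are hypothetical stand-ins for `MDRef`, `TypedMDRef`, and an arbitrary client. With public inheritance the typed handle converts implicitly to a reference to the untyped base, so a client can store the wrong kind of node through it. With private inheritance that conversion is rejected at compile time.

```cpp
#include <cassert>
#include <type_traits>

struct ToyMetadata {
  virtual ~ToyMetadata() = default;
};
struct ToyString : ToyMetadata {};
struct ToyNode : ToyMetadata {};

// Untyped handle (hypothetical stand-in for MDRef).
class ToyRef {
  ToyMetadata *MD = nullptr;

public:
  ToyRef() = default;
  explicit ToyRef(ToyMetadata *MD) : MD(MD) {}
  ToyMetadata *get() const { return MD; }
  void reset(ToyMetadata *New) { MD = New; }
};

// Public inheritance, as in the quoted line: converts to ToyRef&.
template <class T> class UnsafeTypedRef : public ToyRef {
public:
  explicit UnsafeTypedRef(T *MD) : ToyRef(MD) {}
  T *get() const { return static_cast<T *>(ToyRef::get()); }
};

// Private inheritance: the base interface is hidden, and only a
// type-checked reset() is re-exported.
template <class T> class SafeTypedRef : private ToyRef {
public:
  explicit SafeTypedRef(T *MD) : ToyRef(MD) {}
  T *get() const { return static_cast<T *>(ToyRef::get()); }
  void reset(T *New) { ToyRef::reset(New); }
};

// A client that only knows about the untyped handle.
inline void clobber(ToyRef &Ref, ToyMetadata *MD) { Ref.reset(MD); }

static_assert(
    std::is_convertible<UnsafeTypedRef<ToyString> &, ToyRef &>::value,
    "public base: the typed handle can be passed off as ToyRef&");
static_assert(
    !std::is_convertible<SafeTypedRef<ToyString> &, ToyRef &>::value,
    "private base: the same conversion is rejected");
```

Calling `clobber()` on an `UnsafeTypedRef<ToyString>` with a `ToyNode` compiles and leaves the handle pointing at a non-string; the same call on a `SafeTypedRef<ToyString>` does not compile.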

+/// \note This is really cheap and easy to support, and it's useful for
+/// implementing:

When this makes more progress, the comments should explain what it does and how
it works, out of context of the patch.

+  MDString(LLVMContext *Context) : Metadata(*Context, MDStringKind) {
+    assert(Context && "Expected non-null context");
...
+  static MDStringRef get(LLVMContext &Context, StringRef String);

You have a reference/pointer disagreement here; I’d recommend taking
LLVMContext& in the ctor for uniformity (and removing the assert).

+class ReplaceableMetadataImpl {

I’m not following your algorithm intentions well yet, but this seems like a
really heavy-weight implementation, given that this is a common base class for a
number of other important things.

+class ValueAsMetadata : public Metadata, ReplaceableMetadataImpl {

I think that ValueAsMetadata is a great thing to have - it is essential to refer
to global variables etc.  That said, I think that it will eventually be
worthwhile to have metadata versions of integers and other common things to
reduce memory.  This should come in a later pass of course.

I don’t follow the point of
+class UniquableMDNode : public MDNode {

What would subclass it?  What is the use-case?  Won’t MDStrings continue to be
uniqued?

Overall, I’m thrilled to see this direction. Thanks for pushing forward on it,
Duncan!

-Chris

Duncan P. N. Exon Smith

2014-Nov-10 17:08 UTC


> On 2014-Nov-09, at 22:24, Hal Finkel <hfinkel at anl.gov> wrote:
> 
>> 1. No more arbitrary RAUW of metadata.
>> 
>> While we'd keep support for RAUW of temporary MDNodes for use as
>> forward declarations (when parsing assembly or constructing cyclic
>> graphs), drop RAUW support for all other metadata.
>
> So temporary MDNodes would be Values, but regular metadata would not be?
> Will regular metadata nodes no longer have lists of users? If I have a
> TrackingVH<MDNode> with temporary MDNodes, after I call
> replaceAllUsesWith, what happens?
>
> I'm specifically wondering how we'll need to update
> CloneAliasScopeMetadata in lib/Transforms/Utils/InlineFunction.cpp, and I
> don't see any fundamental problem with what you've proposed, but these
> seem like generic upgrade questions.
>
> Thanks again,
> Hal

So, none of them would be `Value`s.

`TempMDNode` supports its own, non-`Value`-based, RAUW via
`ReplaceableMetadataImpl` (`ValueAsMetadata` uses the same).

`CloneAliasScopeMetadata()` should use `TrackingMDRef` in place of
`TrackingVH<MDNode>`.  `TrackingMDRef` will register itself with the
`Metadata` if it's replaceable, and if/when it gets RAUW'ed, the pointer
will get updated.
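
As a rough illustration of that registration scheme (not the code in the patch), the sketch below uses hypothetical `ToyMetadata`, `ToyTempNode`, and `ToyTrackingRef` classes. A tracking handle registers itself only when its target is replaceable, and the replaceable node rewrites every registered handle when it is RAUW'ed. The `std::unordered_set` bookkeeping is an assumption made for brevity; the real `ReplaceableMetadataImpl` may store its users differently.

```cpp
#include <cassert>
#include <unordered_set>

class ToyTrackingRef;

struct ToyMetadata {
  virtual ~ToyMetadata() = default;
  virtual bool isReplaceable() const { return false; }
};

// Only the forward-reference node pays for a list of trackers.
struct ToyTempNode : ToyMetadata {
  std::unordered_set<ToyTrackingRef *> Trackers;
  bool isReplaceable() const override { return true; }
  void replaceAllUsesWith(ToyMetadata *New);
};

class ToyTrackingRef {
  ToyMetadata *MD = nullptr;

  // In this toy model ToyTempNode is the only replaceable class, so the
  // static_cast below is safe.
  void track() {
    if (MD && MD->isReplaceable())
      static_cast<ToyTempNode *>(MD)->Trackers.insert(this);
  }
  void untrack() {
    if (MD && MD->isReplaceable())
      static_cast<ToyTempNode *>(MD)->Trackers.erase(this);
  }

public:
  explicit ToyTrackingRef(ToyMetadata *MD) : MD(MD) { track(); }
  ToyTrackingRef(const ToyTrackingRef &) = delete;
  ToyTrackingRef &operator=(const ToyTrackingRef &) = delete;
  ~ToyTrackingRef() { untrack(); }

  ToyMetadata *get() const { return MD; }
  void reset(ToyMetadata *New) {
    untrack();
    MD = New;
    track();
  }
};

inline void ToyTempNode::replaceAllUsesWith(ToyMetadata *New) {
  // reset() edits Trackers, so walk a copy of the set.
  std::unordered_set<ToyTrackingRef *> Pending = Trackers;
  for (ToyTrackingRef *Ref : Pending)
    Ref->reset(New);
}

// The handle itself stays pointer-sized; the bookkeeping lives in the
// replaceable node.
static_assert(sizeof(ToyTrackingRef) == sizeof(void *),
              "tracking handle is one pointer wide");
```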

If you have another look at the patch it might be more clear now.

BTW, another thing I added in that sketch was the option to explicitly
request a non-uniqued `MDNode`.  Haven't thought through details, but I
was specifically thinking this would be a cleaner way to create your
non-uniquable alias nodes (instead of the current self-referential
approach).

My working straw man for assembly syntax looks like this:

    !1 = metadata distinct !{ ... }

As an example, the alias "root" nodes could change from this:

    !1 = metadata !{ metadata !1 }

to this:

    !1 = metadata distinct !{}

Which means constructing them would change from this:

    MDNode *T = MDNode::getTemporary(Context, None);
    MDNode *N = MDNode::get(Context, {T});
    T->replaceAllUsesWith(N);
    MDNode::deleteTemporary(T);

to this:

    // In that patch it's "getNonUniqued()".
    MDNode *N = MDNode::getDistinct(Context, None);

Furthermore, if all references to the alias root got dropped, they'd
clean themselves up instead of keeping each other alive in a cycle.
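
The cleanup difference can be seen in a toy model that uses `std::shared_ptr` as a stand-in for the reference counting in the sketch. `ToyNode`, `makeSelfReferentialRoot`, and `makeDistinctRoot` are hypothetical names, and the real ownership scheme is assumed to differ in detail. A node that holds a counted reference to itself never reaches a count of zero, while an operand-free distinct node is freed as soon as its last handle goes away.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct ToyNode {
  static int Live; // number of nodes currently alive
  std::vector<std::shared_ptr<ToyNode>> Ops;
  ToyNode() { ++Live; }
  ~ToyNode() { --Live; }
};
int ToyNode::Live = 0;

// Old-style root: the node is its own operand, which forms a cycle.
// Only a weak handle escapes, so the caller holds no strong reference.
inline std::weak_ptr<ToyNode> makeSelfReferentialRoot() {
  auto N = std::make_shared<ToyNode>();
  N->Ops.push_back(N);
  return N;
}

// Distinct-style root: no operands, so nothing keeps it alive once the
// last strong handle is dropped.
inline std::weak_ptr<ToyNode> makeDistinctRoot() {
  auto N = std::make_shared<ToyNode>();
  return N;
}
```

After both functions return, the self-referential node is still alive even though nobody holds a strong handle to it, whereas the distinct node has already been destroyed.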

Duncan P. N. Exon Smith

2014-Nov-10 18:17 UTC


> On 2014-Nov-10, at 08:30, Chris Lattner <clattner at apple.com> wrote:
>
>> On Nov 9, 2014, at 5:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>> In response to my recent commits (e.g., [1]) that changed API from
>> `MDNode` to `Value`, Eric had a really interesting idea [2] -- split
>> metadata entirely from the `Value` hierarchy, and drop general support
>> for metadata RAUW.
>
> Wow, this never occurred to me, but in hindsight seems like obviously the
> right direction.

Yup, good on Eric here.

>> Here is what we'd lose:
>>
>> 1. No more arbitrary RAUW of metadata.
>>
>>   While we'd keep support for RAUW of temporary MDNodes for use as
>>   forward declarations (when parsing assembly or constructing cyclic
>>   graphs), drop RAUW support for all other metadata.
>
> This seems fine to me, a corollary of this should be that MDNodes never
> “move around” in memory due to late uniquing etc, which means that TrackingVH
> shouldn’t be necessary, right?  This should make all frontends a lot more
> memory efficient because they can just use raw pointers to MDNodes everywhere.

Almost.

Two caveats:

 1. Handles should generally be `MDRef` instead of `Metadata*`, since I
    threw in reference-counting semantics so that no-longer-referenced
    metadata cleans itself up.

 2. If a handle might point to a forward reference -- i.e., a
    `TempMDNode` in this patch -- it should use `TrackingMDRef`.  When
    the forward reference gets resolved, it updates all of its tracking
    references.

Nevertheless, `sizeof(MDRef) == sizeof(TrackingMDRef) == sizeof(void*)`.
Frontends *only* pay memory overhead for the currently-unresolved
forward references.
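
A minimal sketch of a pointer-sized, reference-counting handle follows. It assumes an intrusive count stored in the node; `ToyMetadata` and `ToyMDRef` are hypothetical names rather than the patch's classes, and the patch may manage the count differently.

```cpp
#include <cassert>
#include <utility>

struct ToyMetadata {
  unsigned RefCount = 0; // intrusive count, owned by the handles
  static int Live;       // number of nodes currently alive
  ToyMetadata() { ++Live; }
  ~ToyMetadata() { --Live; }
};
int ToyMetadata::Live = 0;

class ToyMDRef {
  ToyMetadata *MD = nullptr;

  void retain() {
    if (MD)
      ++MD->RefCount;
  }
  void release() {
    // The last handle to go away deletes the node.
    if (MD && --MD->RefCount == 0)
      delete MD;
    MD = nullptr;
  }

public:
  ToyMDRef() = default;
  explicit ToyMDRef(ToyMetadata *MD) : MD(MD) { retain(); }
  ToyMDRef(const ToyMDRef &O) : MD(O.MD) { retain(); }
  ToyMDRef(ToyMDRef &&O) noexcept : MD(O.MD) { O.MD = nullptr; }
  // Copy-and-swap: the by-value parameter releases the old target.
  ToyMDRef &operator=(ToyMDRef O) {
    std::swap(MD, O.MD);
    return *this;
  }
  ~ToyMDRef() { release(); }

  ToyMetadata *get() const { return MD; }
};

// The handle costs one pointer; the count lives in the node.
static_assert(sizeof(ToyMDRef) == sizeof(void *),
              "handle is one pointer wide");
```

Keeping the count inside the node is what lets the handle stay one pointer wide, which is the property the `sizeof` equality above relies on.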

(Maybe `MDNodeFwdRef` is a better name than `TempMDNode`?)
>
>> 2. No more function-local metadata.
>
> This was a great idea that never mattered, I’m happy to drop it for the
> greater good :-)

Awesome.

> +class Metadata {
> +private:
> +  LLVMContext &Context;
>
> Out of curiosity, why do Metadata nodes all need to carry around a context?
> This seems like memory bloat that would be great to avoid if possible.

There are two uses for the context:

  - RAUW.  Metadata that can RAUW (`TempMDNode` and `ValueAsMetadata`)
    need a context.  Other metadata need access to a context when their
    operands get RAUW'ed (so they can re-unique themselves), but this
    could be passed in as an argument, so they don't need their own.
    
  - Reference counting.  My sketch uses reference counting for all the
    nodes, so they all need a context to delete themselves when their
    last reference gets dropped.

Customized ownership of debug info metadata will allow us to get the
context from parent pointers, so we might be able to remove this down
the road (depending on whether the `Metadata` subclass is uniqued).

> +template <class T> class TypedMDRef : public MDRef {
>
> Is this a safe way to model this, should it use private inheritance
> instead?  Covariance seems like it would break this: specifically something
> could pass off a typed MDRef as an MDRef&, then the client could reassign
> something of the wrong type into it.

Good point.

> +  MDString(LLVMContext *Context) : Metadata(*Context, MDStringKind) {
> +    assert(Context && "Expected non-null context");
> ...
> +  static MDStringRef get(LLVMContext &Context, StringRef String);
>
> You have reference/pointer disagreement, I’d recommend taking
> LLVMContext& to the ctor for uniformity (and remove the assert).

This is a workaround for `StringMapEntry`'s imperfect forwarding --
it passes the `InitVal` constructor argument by value.

I'll fix the problem at its source and clean this up.

> +class ReplaceableMetadataImpl {
>
> I’m not following your algorithm intentions well yet, but this seems like a
> really heavy-weight implementation, given that this is a common base class
> for a number of other important things.

Unfortunately I think this weight will be hard to optimize out (although
I'm sure it could be a little smaller).

In the `Value` hierarchy, you pay memory for RAUW at the site of each
`Use`.  This makes sense, since most values can be RAUW'ed.

Since very little metadata needs to be RAUW'ed, we don't want to pay
memory at every use-site.  Instead, we pay the cost inside the
(hopefully) few instances that support RAUW.
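
The trade-off can be put in rough numbers with a toy layout comparison. `ToyUse` and `ToyMDOperand` are hypothetical structs, not LLVM's actual `Use` or the patch's operand type; the three-pointer layout is only an assumed shape for a use-list entry that lets a value find and rewrite its users.

```cpp
#include <cassert>
#include <cstddef>

struct ToyValue;
struct ToyMetadata;

// Value-style operand slot: every use site carries list links so the
// used value can walk and rewrite all of its users.
struct ToyUse {
  ToyValue *Val;
  ToyUse *Next;
  ToyUse **Prev;
};

// Metadata-style operand slot in this sketch: just the pointer.  Any
// RAUW bookkeeping lives in the few replaceable nodes instead.
struct ToyMDOperand {
  ToyMetadata *MD;
};

// Bytes spent on operand slots for a node with NumOps operands.
constexpr std::size_t useBytes(std::size_t NumOps) {
  return NumOps * sizeof(ToyUse);
}
constexpr std::size_t mdOperandBytes(std::size_t NumOps) {
  return NumOps * sizeof(ToyMDOperand);
}

// Holds on mainstream ABIs, where all object pointers share one size.
static_assert(sizeof(ToyUse) == 3 * sizeof(void *),
              "three pointers per use site in this toy layout");
```

Under these assumptions, a node pays three pointers per operand in the use-list layout and one pointer per operand in the metadata layout, regardless of whether the operand's target can ever be replaced.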

The core assumption is that there are relatively few replaceable
metadata instances.  The only subclasses are `ValueAsMetadata` and
`TempMDNode`.  How many of these are live?

  - There will be at most one `ValueAsMetadata` instance for each
    metadata operand that points at a `Value`.  My data from a couple of
    months ago showed that there are very few of these.

  - `TempMDNodes` are used as forward references.  You only pay their
    cost until the forward reference gets resolved (i.e., deleted).

The main case I'm worried about here is references to `ConstantInt`s,
which are used pretty heavily outside of debug info (e.g., in `!prof`
attachments).  However, if that shows up in a profile, we can bypass
`Value`s entirely by creating a new `MDIntArray` subclass (or
something).

> +class ValueAsMetadata : public Metadata, ReplaceableMetadataImpl {
>
> I think that ValueAsMetadata is a great thing to have - it is essential to
> refer to global variables etc.  That said, I think that it will eventually be
> worthwhile to have metadata versions of integers and other common things to
> reduce memory.  This should come in a later pass of course.

Yup, agree entirely.

>
> I don’t follow the point of
> +class UniquableMDNode : public MDNode {
>
> What would subclass it?  What is the use-case?  Won’t MDStrings continue to
> be uniqued?

I think you've just been misled by my terrible name.

This sketch splits the "temporary" concept for `MDNode` into a separate
subclass, so that non-forward-reference `MDNode`s don't have to pay for
RAUW overhead.

The class hierarchy I envision looks something like this:

    Metadata
      MDNode
        TempMDNode      // MDNodeFwdRef?
        UniquableMDNode // GenericMDNode?
        DINode          // Is this layer useful?
          DILocation
          DIScope
            DIType
              DIBasicType
              DICompositeType
              ...
            DISubprogram
            ...
          DICompileUnit
          ...
      MDString
      ValueAsMetadata     
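
For illustration only, here is how the top of that hierarchy could be discriminated with a kind tag, in the spirit of the `Metadata(*Context, MDStringKind)` constructor call quoted earlier. All `Toy*` names, the enumerator list, and the `toyIsa` helper are hypothetical; the debug info subclasses and `ValueAsMetadata` are left out to keep the sketch short.

```cpp
#include <cassert>

class ToyMetadata {
public:
  enum Kind { TempMDNodeKind, UniquableMDNodeKind, MDStringKind };
  Kind getKind() const { return K; }

protected:
  explicit ToyMetadata(Kind K) : K(K) {}

private:
  Kind K;
};

// Abstract middle layer: matches either node kind.
class ToyMDNode : public ToyMetadata {
protected:
  explicit ToyMDNode(Kind K) : ToyMetadata(K) {}

public:
  static bool classof(const ToyMetadata *MD) {
    return MD->getKind() == TempMDNodeKind ||
           MD->getKind() == UniquableMDNodeKind;
  }
};

// Forward reference; the only node kind that would carry RAUW state.
class ToyTempMDNode : public ToyMDNode {
public:
  ToyTempMDNode() : ToyMDNode(TempMDNodeKind) {}
  static bool classof(const ToyMetadata *MD) {
    return MD->getKind() == TempMDNodeKind;
  }
};

// Ordinary node, uniqued by default.
class ToyUniquableMDNode : public ToyMDNode {
public:
  ToyUniquableMDNode() : ToyMDNode(UniquableMDNodeKind) {}
  static bool classof(const ToyMetadata *MD) {
    return MD->getKind() == UniquableMDNodeKind;
  }
};

class ToyMDString : public ToyMetadata {
public:
  ToyMDString() : ToyMetadata(MDStringKind) {}
  static bool classof(const ToyMetadata *MD) {
    return MD->getKind() == MDStringKind;
  }
};

// Hypothetical helper: kind test without RTTI.
template <class To> bool toyIsa(const ToyMetadata *MD) {
  return To::classof(MD);
}
```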

`UniquableMDNode` is a leaf-class that behaves like the current `MDNode`
(when it's not a temporary).  I called it "uniquable" because, unlike
`TempMDNode`, it is uniqued by default (although you can opt-out of it,
and uniquing might be dropped).

Maybe a better name is `GenericMDNode`?

Also, off-topic, but after sketching out the imagined hierarchy above,
I'm not sure `DINode` is particularly useful.

Philip Reames

2014-Nov-10 22:13 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

On 11/09/2014 05:02 PM, Duncan P. N. Exon Smith wrote:
> TL;DR: If you use metadata (especially if it's out-of-tree), check the
> numbered list of lost functionality below to see whether I'm trying to
> break your compiler permanently.
>
> In response to my recent commits (e.g., [1]) that changed API from
> `MDNode` to `Value`, Eric had a really interesting idea [2] -- split
> metadata entirely from the `Value` hierarchy, and drop general support
> for metadata RAUW.
>
> After hacking together some example code, this seems overwhelmingly to
> me like the right direction.  See the attached metadata-v2.patch for my
> sketch of what the current metadata primitives might look like in the
> new hierarchy (includes LLVMContextImpl uniquing support).
>
> The initial motivation was to avoid the API weaknesses caused by having
> non-`MDNode` metadata that inherits from `Value`.  In particular,
> instead of changing API from `MDNode` to `Value`, change it to a base
> class called `Metadata`, which sheds the underused and expensive `Value`
> base class entirely.
>
> The implications are broader: an enormous reduction in complexity (and
> overhead) for metadata.
>
> This change implies minor (major?) loss of functionality for metadata,
> but Eric and I suspect that the only hard-to-fix user is debug info
> itself, whose IR infrastructure I'm rewriting anyway.

I generally support this direction.  I do not know of any current use
case which would be broken by this change.  I do have some hesitations
about potential future use though (see below).

>
> Here is what we'd lose:
>
>   1. No more arbitrary RAUW of metadata.
>
>      While we'd keep support for RAUW of temporary MDNodes for use as
>      forward declarations (when parsing assembly or constructing cyclic
>      graphs), drop RAUW support for all other metadata.
>
>      Note that we'd also keep support for RAUW of `Value` operands of
>      metadata.
>
>      If the RAUW of an operand causes a uniquing collision, uniquing
>      would be dropped for that node.  This matches the current behaviour
>      when an operand goes to null.
>
>      Upgrade path: none.
>
>   2. No more function-local metadata.
>
>      AFAICT, function-local metadata is *only* used for indirect
>      references to instructions and arguments in `@llvm.dbg.value` and
>      `@llvm.dbg.declare` intrinsics.  The first argument of the following
>      is an example:
>
>          call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>                                    metadata !1)

I hesitate a bit here.  The current range metadata deals only with
constant ranges, but I could see a day where encoding a range in terms
of two other Values might be useful.  This is one obvious use case, but
there may also be others.

I'm not going to oppose the proposal made, but if there was a way to
preserve the ability to reference function-local SSA values, I'd
slightly prefer that.

>      Note that the debug info people uniformly seem to dislike the status
>      quo, since it's awkward to get from a `Value` to the corresponding
>      intrinsic.
>
>      Upgrade path: Instead of using an intrinsic that references a
>      function-local value through an `MDNode`, attach metadata to the
>      corresponding argument or instruction, or to the terminating
>      instruction of the basic block.  (This requires new support for
>      attaching metadata to function arguments, which I'll have to add for
>      debug info anyway.)
>
> Is this going to break your compiler?  How?  Why is your use case worth
> supporting?
>
> [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141103/242667.html
>      "r221167 - IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()"
> [2]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-November/078581.html
>      "Re: First-class debug info IR: MDLocation"

Duncan P. N. Exon Smith

2014-Nov-11 00:08 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

> On 2014-Nov-10, at 14:13, Philip Reames <listmail at philipreames.com> wrote:
> 
>> 2. No more function-local metadata.
>> 
>>     AFAICT, function-local metadata is *only* used for indirect
>>     references to instructions and arguments in `@llvm.dbg.value` and
>>     `@llvm.dbg.declare` intrinsics.  The first argument of the following
>>     is an example:
>> 
>>         call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>>                                   metadata !1)
>> 
> I hesitate a bit here.  The current range metadata deals only with constant
> ranges, but I could see a day where encoding a range in terms of two other
> Values might be useful.  This is one obvious use case, but there may also be
> others.
> 
> I'm not going to oppose the proposal made, but if there was a way to
> preserve the ability to reference function-local SSA values, I'd slightly
> prefer that.

In theory it's possible, but it adds a large complexity burden
because of the need to track whether metadata is local or global.
I strongly prefer dropping it until a compelling use-case appears.

Note that debug info is rather special in that it's *not allowed*
to modify optimizations or code generation.  For non-debug info,
you can reference local values directly using intrinsics, as (e.g.)
@llvm.assume [1] does.

[1]: http://llvm.org/docs/LangRef.html#llvm-assume-intrinsic

Duncan P. N. Exon Smith

2014-Nov-12 21:00 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

If you don't care about function-local metadata and debug info
intrinsics, skip ahead to the section on assembly syntax in case you
have comments on that.

> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> 2. No more function-local metadata.
> 
>    AFAICT, function-local metadata is *only* used for indirect
>    references to instructions and arguments in `@llvm.dbg.value` and
>    `@llvm.dbg.declare` intrinsics.  The first argument of the following
>    is an example:
> 
>        call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>                                  metadata !1)
> 
>    Note that the debug info people uniformly seem to dislike the status
>    quo, since it's awkward to get from a `Value` to the corresponding
>    intrinsic.
> 
>    Upgrade path: Instead of using an intrinsic that references a
>    function-local value through an `MDNode`, attach metadata to the
>    corresponding argument or instruction, or to the terminating
>    instruction of the basic block.  (This requires new support for
>    attaching metadata to function arguments, which I'll have to add for
>    debug info anyway.)

llvm::Argument attachments are hard
===================================

I've been looking at prototyping metadata attachments to
`llvm::Argument`, which is key to replacing debug info intrinsics.

It's a fairly big piece of new IR, and comes with its own subtle
semantic decisions.  What do you do with metadata attached to arguments
when you inline a function?  If the arguments are remapped to other
instructions (or arguments), they may have metadata of the same kind
attached.  Does one win?  Which one?  Or are they merged?  What if the
arguments get remapped to constants?  What about when a function is
cloned?

While the rest of this metadata-is-not-a-value proposal is effectively
NFC, this `Argument` part could introduce problems.  If I rewrite debug
info intrinsics as argument attachments and then immediately split
`Metadata` from `Value`, any semantic subtleties will be difficult to
diagnose in the noise of the rest of the changes.

While I was looking at this as a shortcut to avoid porting
function-local metadata, I think it introduces more points of failure
than problems it solves.

Limited function-local metadata
-------------------------------

Instead, I propose porting a limited form of function-local metadata,
whose use is severely restricted but covers our current use cases (keep
reading for details).  We can defer replacing debug info intrinsics
until the infrastructure has settled down and is stable.

Assembly syntax
===============

This is a good time to talk about assembly syntax, since it will
demonstrate what I'm thinking for function-local metadata.

Assembly syntax is important.  It's our view into the IR.  If metadata
is typeless (and not a `Value`), that should be reflected in the
assembly syntax.

Old syntax
----------

There are four places metadata can be used/referenced in the IR.

 1. Operands of `MDNode`.

        !0 = metadata !{metadata !"string", metadata !1, i32* @global}

    Notice that the `@global` argument is not metadata: it's an
    `llvm::Constant`.  In the new IR, these will be wrapped in a
    `ValueAsMetadata` instance.

 2. Operands of `NamedMDNode` (yes, they're different).

        !named = metadata !{metadata !0, metadata !1}

    These operands are always `MDNode`.

 3. Attachments to instructions.

        %inst = load i32* @global, !dbg !0

    Notice that we already skip the `metadata` type here.

 4. Arguments to intrinsics.

        call void @llvm.dbg(metadata !{i32 %inst}, metadata !0)

    The first argument is subtle -- that's a function-local `MDNode`
    with `%inst` as its only operand.

    In the new IR, the second operand will be a `MetadataAsValue`
    instance that contains a reference to the `MDNode` numbered `!0`.

New syntax
----------

Types only make sense when an operand can be an `llvm::Value`.  Let's
remove them where they don't make sense.

I propose the following syntax for the above examples, using a new
keyword, `value`:

 1. Operands of `MDNode`.  Drop `metadata`, since metadata doesn't have
    types.  Use `value` to indicate a wrapped `llvm::Value`.

        !0 = !{!"string", !1, value i32* @global}

 2. Operands of `NamedMDNode`.  Drop `metadata`, since metadata doesn't
    have types.

        !named = !{!0, !1}

 3. Attachments to instructions.  No change!

        %inst = load i32* @global, !dbg !0

 4. Arguments to intrinsics.  Keep `metadata`, since here it's wrapped
    into an `llvm::Value` (which has a type).  Use `value` to indicate a
    metadata-wrapped value.

        call void @llvm.dbg(metadata value i32 %inst, metadata !0)

    Notice that the first argument doesn't use an `MDNode` anymore.
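As a rough illustration of the difference for case 1, here is a small sketch that prints the same operand list both ways.  `Operand`, `printOldNode` and `printNewNode` are hypothetical names invented for this example, and this is not how the real assembly writer is structured; it only shows where `metadata` disappears and where `value` shows up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One operand of a toy MDNode.  Text holds the operand as it would be spelled
// without any `metadata`/`value` prefix, e.g. !1, !"string", or i32* @global.
struct Operand {
  bool IsValue; // true when the operand is a wrapped llvm::Value
  std::string Text;
};

// Shared loop: emits Open, then each operand with the prefix chosen by kind.
static std::string printNode(const std::vector<Operand> &Ops,
                             const std::string &Open,
                             const std::string &MetadataPrefix,
                             const std::string &ValuePrefix) {
  std::string Out = Open;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    assert(!Ops[I].Text.empty() && "operand needs a spelling");
    if (I != 0)
      Out += ", ";
    Out += Ops[I].IsValue ? ValuePrefix : MetadataPrefix;
    Out += Ops[I].Text;
  }
  return Out + "}";
}

// Old syntax: the node and each metadata operand carry the `metadata` type;
// a Value operand is printed with just its own type.
std::string printOldNode(const std::vector<Operand> &Ops) {
  return printNode(Ops, "metadata !{", "metadata ", "");
}

// Proposed syntax: metadata is typeless, and `value` marks a wrapped Value.
std::string printNewNode(const std::vector<Operand> &Ops) {
  return printNode(Ops, "!{", "", "value ");
}
```

Fed the three operands of the `!0` example (`!"string"`, `!1`, and the global), the first function yields the old spelling and the second yields the proposed one.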

Restrictions on function-local metadata
=======================================

In the new IR, function-local metadata (say, `LocalValueAsMetadata`)
*cannot* be used as an operand to metadata -- the only legal place for
it is in a `MetadataAsValue` instance.  This prevents the additional
complexity from poisoning the rest of the metadata hierarchy.

Effectively, this restricts function-local metadata to direct operands
of intrinsics.
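A minimal sketch of how that restriction could be checked, assuming a toy model in which every piece of metadata carries a kind tag.  `MDKind`, `verifyNoLocalOperands` and `verifyIntrinsicArg` are made-up names rather than anything from the patch, and the recursion assumes an acyclic graph.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class MDKind { MDString, MDNode, ValueAsMetadata, LocalValueAsMetadata };

// Toy metadata: a kind tag plus operands (only meaningful for MDNode).
struct Metadata {
  MDKind Kind;
  std::vector<const Metadata *> Ops;
  explicit Metadata(MDKind K, std::vector<const Metadata *> O = {})
      : Kind(K), Ops(std::move(O)) {}
};

// Value-side wrapper used for intrinsic arguments.  It may hold any
// metadata, including the function-local kind.
struct MetadataAsValue {
  const Metadata *MD;
  explicit MetadataAsValue(const Metadata *M) : MD(M) {
    assert(M && "wrapper needs metadata");
  }
};

// Returns true if no node reachable from MD has a function-local operand.
// Assumes the graph is acyclic; a real verifier would track visited nodes.
bool verifyNoLocalOperands(const Metadata &MD) {
  for (const Metadata *Op : MD.Ops) {
    if (!Op)
      continue; // null operands are allowed
    if (Op->Kind == MDKind::LocalValueAsMetadata)
      return false;
    if (!verifyNoLocalOperands(*Op))
      return false;
  }
  return true;
}

// An intrinsic argument is legal when the wrapped metadata is either a
// function-local value held directly, or a graph with no local operands.
bool verifyIntrinsicArg(const MetadataAsValue &Arg) {
  return verifyNoLocalOperands(*Arg.MD);
}
```

Under this model, wrapping a `LocalValueAsMetadata` directly in a `MetadataAsValue` passes, while any `MDNode` that lists one as an operand fails, no matter how deeply it is nested.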

Adrian Prantl

2014-Nov-13 19:17 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

> On Nov 12, 2014, at 1:00 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> If you don't care about function-local metadata and debug info
> intrinsics, skip ahead to the section on assembly syntax in case you
> have comments on that.
> 
>> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>> 
>> 2. No more function-local metadata.
>> 
>>   AFAICT, function-local metadata is *only* used for indirect
>>   references to instructions and arguments in `@llvm.dbg.value` and
>>   `@llvm.dbg.declare` intrinsics.  The first argument of the following
>>   is an example:
>> 
>>       call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>>                                 metadata !1)
>> 
>>   Note that the debug info people uniformly seem to dislike the status
>>   quo, since it's awkward to get from a `Value` to the corresponding
>>   intrinsic.
>> 
>>   Upgrade path: Instead of using an intrinsic that references a
>>   function-local value through an `MDNode`, attach metadata to the
>>   corresponding argument or instruction, or to the terminating
>>   instruction of the basic block.  (This requires new support for
>>   attaching metadata to function arguments, which I'll have to add for
>>   debug info anyway.)
> 
> llvm::Argument attachments are hard
> ===================================
> 
> I've been looking at prototyping metadata attachments to
> `llvm::Argument`, which is key to replacing debug info intrinsics.
> 
> It's a fairly big piece of new IR, and comes with its own subtle
> semantic decisions.  What do you do with metadata attached to arguments
> when you inline a function?  If the arguments are remapped to other
> instructions (or arguments), they may have metadata of the same kind
> attached.  Does one win?  Which one?  Or are they merged?  What if the
> arguments get remapped to constants?  What about when a function is
> cloned?
> 
> While the rest of this metadata-is-not-a-value proposal is effectively
> NFC, this `Argument` part could introduce problems.  If I rewrite debug
> info intrinsics as argument attachments and then immediately split
> `Metadata` from `Value`, any semantic subtleties will be difficult to
> diagnose in the noise of the rest of the changes.
> 
> While I was looking at this as a shortcut to avoid porting
> function-local metadata, I think it introduces more points of failure
> than problems it solves.

One thing to consider is that there are cases where we describe function
arguments without referencing the argument in the intrinsic at all. Currently,
if you compile

  $ cat struct.c
  struct s { int a; int b; };
  int foo(struct s s1) { return s1.a; }

  $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c

we cannot preserve the debug info for the argument so it is turned into an
intrinsic describing an undef value.

  ; Function Attrs: nounwind readnone ssp uwtable
  define i32 @foo(i64 %s1.coerce) #0 {
  entry:
    %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32
    tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata !19), !dbg !20
    ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21
  }

  !18 = metadata !{%struct.s* undef}

Note that it is critical for this DIVariable to make it into the debug info even
if it is undefined, or the function arguments won’t match the function
signature.

-- adrian

Philip Reames

2014-Nov-13 21:34 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

On 11/12/2014 01:00 PM, Duncan P. N. Exon Smith wrote:
> If you don't care about function-local metadata and debug info
> intrinsics, skip ahead to the section on assembly syntax in case you
> have comments on that.
>
>> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>>
>> 2. No more function-local metadata.
>>
>>     AFAICT, function-local metadata is *only* used for indirect
>>     references to instructions and arguments in `@llvm.dbg.value` and
>>     `@llvm.dbg.declare` intrinsics.  The first argument of the following
>>     is an example:
>>
>>         call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0,
>>                                   metadata !1)
>>
>>     Note that the debug info people uniformly seem to dislike the status
>>     quo, since it's awkward to get from a `Value` to the corresponding
>>     intrinsic.
>>
>>     Upgrade path: Instead of using an intrinsic that references a
>>     function-local value through an `MDNode`, attach metadata to the
>>     corresponding argument or instruction, or to the terminating
>>     instruction of the basic block.  (This requires new support for
>>     attaching metadata to function arguments, which I'll have to add for
>>     debug info anyway.)
>
> llvm::Argument attachments are hard
> ===================================
>
> I've been looking at prototyping metadata attachments to
> `llvm::Argument`, which is key to replacing debug info intrinsics.
>
> It's a fairly big piece of new IR, and comes with its own subtle
> semantic decisions.  What do you do with metadata attached to arguments
> when you inline a function?  If the arguments are remapped to other
> instructions (or arguments), they may have metadata of the same kind
> attached.  Does one win?  Which one?  Or are they merged?  What if the
> arguments get remapped to constants?  What about when a function is
> cloned?
>
> While the rest of this metadata-is-not-a-value proposal is effectively
> NFC, this `Argument` part could introduce problems.  If I rewrite debug
> info intrinsics as argument attachments and then immediately split
> `Metadata` from `Value`, any semantic subtleties will be difficult to
> diagnose in the noise of the rest of the changes.
>
> While I was looking at this as a shortcut to avoid porting
> function-local metadata, I think it introduces more points of failure
> than problems it solves.
>
> Limited function-local metadata
> -------------------------------
>
> Instead, I propose porting a limited form of function-local metadata,
> whose use is severely restricted but covers our current use cases (keep
> reading for details).  We can defer replacing debug info intrinsics
> until the infrastructure has settled down and is stable.

This seems entirely reasonable.

Long term, supporting metadata on arguments would be useful, but we
should also have a broader discussion about the role of attributes and
metadata before we do that.

>
> Assembly syntax
> ===============
>
> This is a good time to talk about assembly syntax, since it will
> demonstrate what I'm thinking for function-local metadata.
>
> Assembly syntax is important.  It's our view into the IR.  If metadata
> is typeless (and not a `Value`), that should be reflected in the
> assembly syntax.
>
> Old syntax
> ----------
>
> There are four places metadata can be used/referenced in the IR.
>
>   1. Operands of `MDNode`.
>
>          !0 = metadata !{metadata !"string", metadata !1, i32* @global}
>
>      Notice that the `@global` argument is not metadata: it's an
>      `llvm::Constant`.  In the new IR, these will be wrapped in a
>      `ValueAsMetadata` instance.
>
>   2. Operands of `NamedMDNode` (yes, they're different).
>
>          !named = metadata !{metadata !0, metadata !1}
>
>      These operands are always `MDNode`.
>
>   3. Attachments to instructions.
>
>          %inst = load i32* @global, !dbg !0
>
>      Notice that we already skip the `metadata` type here.
>
>   4. Arguments to intrinsics.
>
>          call void @llvm.dbg(metadata !{i32 %inst}, metadata !0)
>
>      The first argument is subtle -- that's a function-local `MDNode`
>      with `%inst` as its only operand.
>
>      In the new IR, the second operand will be a `MetadataAsValue`
>      instance that contains a reference to the `MDNode` numbered `!0`.
>
> New syntax
> ----------
>
> Types only make sense when an operand can be an `llvm::Value`.  Let's
> remove them where they don't make sense.

Hm, how does this interact with range metadata?  Currently, the type of
the values making up the range has to match the instruction they're
attached to.  This seems like it could be a change in behaviour.
Thinking about it, it doesn't seem like a bad change, but it is a
change.  Are there other cases like this?

>
> I propose the following syntax for the above examples, using a new
> keyword, `value`:
>
>   1. Operands of `MDNode`.  Drop `metadata`, since metadata doesn't have
>      types.  Use `value` to indicate a wrapped `llvm::Value`.
>
>          !0 = !{!"string", !1, value i32* @global}
>
>   2. Operands of `NamedMDNode`.  Drop `metadata`, since metadata doesn't
>      have types.
>
>          !named = !{!0, !1}
>
>   3. Attachments to instructions.  No change!
>
>          %inst = load i32* @global, !dbg !0
>
>   4. Arguments to intrinsics.  Keep `metadata`, since here it's wrapped
>      into an `llvm::Value` (which has a type).  Use `value` to indicate a
>      metadata-wrapped value.
>
>          call void @llvm.dbg(metadata value i32 %inst, metadata !0)
>
>      Notice that the first argument doesn't use an `MDNode` anymore.
>
> Restrictions on function-local metadata
> =======================================
>
> In the new IR, function-local metadata (say, `LocalValueAsMetadata`)
> *cannot* be used as an operand to metadata -- the only legal place for
> it is in a `MetadataAsValue` instance.  This prevents the additional
> complexity from poisoning the rest of the metadata hierarchy.
>
> Effectively, this restricts function-local metadata to direct operands
> of intrinsics.

This seems entirely reasonable.

Philip

Eric Christopher

2014-Nov-14 22:11 UTC

[LLVMdev] [RFC] Separating Metadata from the Value hierarchy

Seems reasonable. I'd love to get rid of the intrinsics but... yeah.

Thanks Duncan.

-eric
