Adrian Prantl
2014-Nov-13 19:17 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> On Nov 12, 2014, at 1:00 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > If you don't care about function-local metadata and debug info > intrinsics, skip ahead to the section on assembly syntax in case you > have comments on that. > >> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >> >> 2. No more function-local metadata. >> >> AFAICT, function-local metadata is *only* used for indirect >> references to instructions and arguments in `@llvm.dbg.value` and >> `@llvm.dbg.declare` intrinsics. The first argument of the following >> is an example: >> >> call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, >> metadata !1) >> >> Note that the debug info people uniformly seem to dislike the status >> quo, since it's awkward to get from a `Value` to the corresponding >> intrinsic. >> >> Upgrade path: Instead of using an intrinsic that references a >> function-local value through an `MDNode`, attach metadata to the >> corresponding argument or instruction, or to the terminating >> instruction of the basic block. (This requires new support for >> attaching metadata to function arguments, which I'll have to add for >> debug info anyway.) > > llvm::Argument attachments are hard > ==================================> > I've been looking at prototyping metadata attachments to > `llvm::Argument`, which is key to replacing debug info intrinsics. > > It's a fairly big piece of new IR, and comes with its own subtle > semantic decisions. What do you do with metadata attached to arguments > when you inline a function? If the arguments are remapped to other > instructions (or arguments), they may have metadata of the same kind > attached. Does one win? Which one? Or are they merged? What if the > arguments get remapped to constants? What about when a function is > cloned? > > While the rest of this metadata-is-not-a-value proposal is effectively > NFC, this `Argument` part could introduce problems. If I rewrite debug > info intrinsics as argument attachments and then immediately split > `Metadata` from `Value`, any semantic subtleties will be difficult to > diagnose in the noise of the rest of the changes. > > While I was looking at this as a shortcut to avoid porting > function-local metadata, I think it introduces more points of failure > than problems it solves.One thing to consider is that there are cases were we describe function arguments without referencing the argument in the intrinsic at all. Currently, if you compile $ cat struct.c struct s { int a; int b; }; int foo(struct s s1) { return s1.a; } $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c we cannot preserve the debug info for the argument so it is turned into an intrinsic describing an undef value. ; Function Attrs: nounwind readnone ssp uwtable define i32 @foo(i64 %s1.coerce) #0 { entry: %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32 tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata !19), !dbg !20 ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21 } !18 = metadata !{%struct.s* undef} Note that it is critical for this DIVariable to make it into the debug info even if it is undefined, or the function arguments won’t match the function signature. -- adrian> > Limited function-local metadata > ------------------------------- > > Instead, I propose porting a limited form of function-local metadata, > whose use is severely restricted but covers our current use cases (keep > reading for details). We can defer replacing debug info intrinsics > until the infrastructure has settled down and is stable. > > Assembly syntax > ==============> > This is a good time to talk about assembly syntax, since it will > demonstrate what I'm thinking for function-local metadata. > > Assembly syntax is important. It's our view into the IR. If metadata > is typeless (and not a `Value`), that should be reflected in the > assembly syntax. > > Old syntax > ---------- > > There are four places metadata can be used/reference in the IR. > > 1. Operands of `MDNode`. > > !0 = metadata !{metadata !"string", metadata !1, i32* @global) > > Notice that the `@global` argument is not metadata: it's an > `llvm::Constant`. In the new IR, these will be wrapped in a > `ValueAsMetadata` instance. > > 2. Operands of `NamedMDNode` (yes, they're different). > > !named = metadata !{metadata !0, metadata !1} > > These operands are always `MDNode`. > > 3. Attachments to instructions. > > %inst = load i32* @global, !dbg !0 > > Notice that we already skip the `metadata` type here. > > 4. Arguments to intrinsics. > > call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) > > The first argument is subtle -- that's a function-local `MDNode` > with `%inst` as its only operand. > > In the new IR, the second operand will be a `MetadataAsValue` > instance that contains a reference to the `MDNode` numbered `!0`. > > New syntax > ---------- > > Types only make sense when an operand can be an `llvm::Value`. Let's > remove them where they don't make sense. > > I propose the following syntax for the above examples, using a new > keyword, `value`: > > 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have > types. Use `value` to indicate a wrapped `llvm::Value`. > > !0 = !{!"string", !1, value i32* @global) > > 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't > have types. > > !named = !{!0, !1} > > 3. Attachments to instructions. No change! > > %inst = load i32* @global, !dbg !0 > > 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped > into an `llvm::Value` (which has a type). Use `value` to indicate a > metadata-wrapped value. > > call void @llvm.dbg(metadata value i32 %inst, metadata !0) > > Notice that the first argument doesn't use an `MDNode` anymore. > > Restrictions on function-local metadata > ======================================> > In the new IR, function-local metadata (say, `LocalValueAsMetadata`) > *cannot* be used as an operand to metadata -- the only legal place for > it is in a `MetadataAsValue` instance. This prevents the additional > complexity from poisoning the rest of the metadata hierarchy. > > Effectively, this restricts function-local metadata to direct operands > of intrinsics. >
David Blaikie
2014-Nov-13 19:29 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
On Thu, Nov 13, 2014 at 11:17 AM, Adrian Prantl <aprantl at apple.com> wrote:> > > On Nov 12, 2014, at 1:00 PM, Duncan P. N. Exon Smith < > dexonsmith at apple.com> wrote: > > > > If you don't care about function-local metadata and debug info > > intrinsics, skip ahead to the section on assembly syntax in case you > > have comments on that. > > > >> On 2014-Nov-09, at 17:02, Duncan P. N. Exon Smith <dexonsmith at apple.com> > wrote: > >> > >> 2. No more function-local metadata. > >> > >> AFAICT, function-local metadata is *only* used for indirect > >> references to instructions and arguments in `@llvm.dbg.value` and > >> `@llvm.dbg.declare` intrinsics. The first argument of the following > >> is an example: > >> > >> call void @llvm.dbg.value(metadata !{i32 %val}, metadata !0, > >> metadata !1) > >> > >> Note that the debug info people uniformly seem to dislike the status > >> quo, since it's awkward to get from a `Value` to the corresponding > >> intrinsic. > >> > >> Upgrade path: Instead of using an intrinsic that references a > >> function-local value through an `MDNode`, attach metadata to the > >> corresponding argument or instruction, or to the terminating > >> instruction of the basic block. (This requires new support for > >> attaching metadata to function arguments, which I'll have to add for > >> debug info anyway.) > > > > llvm::Argument attachments are hard > > ==================================> > > > I've been looking at prototyping metadata attachments to > > `llvm::Argument`, which is key to replacing debug info intrinsics. > > > > It's a fairly big piece of new IR, and comes with its own subtle > > semantic decisions. What do you do with metadata attached to arguments > > when you inline a function? If the arguments are remapped to other > > instructions (or arguments), they may have metadata of the same kind > > attached. Does one win? Which one? Or are they merged? What if the > > arguments get remapped to constants? What about when a function is > > cloned? > > > > While the rest of this metadata-is-not-a-value proposal is effectively > > NFC, this `Argument` part could introduce problems. If I rewrite debug > > info intrinsics as argument attachments and then immediately split > > `Metadata` from `Value`, any semantic subtleties will be difficult to > > diagnose in the noise of the rest of the changes. > > > > While I was looking at this as a shortcut to avoid porting > > function-local metadata, I think it introduces more points of failure > > than problems it solves. > > > One thing to consider is that there are cases were we describe function > arguments without referencing the argument in the intrinsic at all. > Currently, if you compile > > $ cat struct.c > struct s { int a; int b; }; > int foo(struct s s1) { return s1.a; } > > $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c > > we cannot preserve the debug info for the argument so it is turned into an > intrinsic describing an undef value. > > ; Function Attrs: nounwind readnone ssp uwtable > define i32 @foo(i64 %s1.coerce) #0 { > entry: > %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32 > tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata > !19), !dbg !20 > ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21 > } > > !18 = metadata !{%struct.s* undef} >I'm a little confused by this example - ideally in this case we wouldn't lose the variable, it's right there and we should reference it. But, yes, that'll take your SROA patch, etc.> Note that it is critical for this DIVariable to make it into the debug > info even if it is undefined, or the function arguments won’t match the > function signature. >In theory, at -O0 we should never "not have" an argument to attach something to, right? (that'd require DAE to run) and above -O0 we store the variable list so we can't lose variables anyway. Should we just aim for that? How far off are we before that's a practical goal (so we can avoid having to add in an intrinsic for this special case that we plan/hope/know will go away eventually). Alternatively, should we just turn on the variable list even at -O0, I know some other bugs that would 'fix' (fix in the sense of at least providing the right signature - but still be cases of missing variable location info). - David> > -- adrian > > > > > Limited function-local metadata > > ------------------------------- > > > > Instead, I propose porting a limited form of function-local metadata, > > whose use is severely restricted but covers our current use cases (keep > > reading for details). We can defer replacing debug info intrinsics > > until the infrastructure has settled down and is stable. > > > > Assembly syntax > > ==============> > > > This is a good time to talk about assembly syntax, since it will > > demonstrate what I'm thinking for function-local metadata. > > > > Assembly syntax is important. It's our view into the IR. If metadata > > is typeless (and not a `Value`), that should be reflected in the > > assembly syntax. > > > > Old syntax > > ---------- > > > > There are four places metadata can be used/reference in the IR. > > > > 1. Operands of `MDNode`. > > > > !0 = metadata !{metadata !"string", metadata !1, i32* @global) > > > > Notice that the `@global` argument is not metadata: it's an > > `llvm::Constant`. In the new IR, these will be wrapped in a > > `ValueAsMetadata` instance. > > > > 2. Operands of `NamedMDNode` (yes, they're different). > > > > !named = metadata !{metadata !0, metadata !1} > > > > These operands are always `MDNode`. > > > > 3. Attachments to instructions. > > > > %inst = load i32* @global, !dbg !0 > > > > Notice that we already skip the `metadata` type here. > > > > 4. Arguments to intrinsics. > > > > call void @llvm.dbg(metadata !{i32 %inst}, metadata !0) > > > > The first argument is subtle -- that's a function-local `MDNode` > > with `%inst` as its only operand. > > > > In the new IR, the second operand will be a `MetadataAsValue` > > instance that contains a reference to the `MDNode` numbered `!0`. > > > > New syntax > > ---------- > > > > Types only make sense when an operand can be an `llvm::Value`. Let's > > remove them where they don't make sense. > > > > I propose the following syntax for the above examples, using a new > > keyword, `value`: > > > > 1. Operands of `MDNode`. Drop `metadata`, since metadata doesn't have > > types. Use `value` to indicate a wrapped `llvm::Value`. > > > > !0 = !{!"string", !1, value i32* @global) > > > > 2. Operands of `NamedMDNode`. Drop `metadata`, since metadata doesn't > > have types. > > > > !named = !{!0, !1} > > > > 3. Attachments to instructions. No change! > > > > %inst = load i32* @global, !dbg !0 > > > > 4. Arguments to intrinsics. Keep `metadata`, since here it's wrapped > > into an `llvm::Value` (which has a type). Use `value` to indicate a > > metadata-wrapped value. > > > > call void @llvm.dbg(metadata value i32 %inst, metadata !0) > > > > Notice that the first argument doesn't use an `MDNode` anymore. > > > > Restrictions on function-local metadata > > ======================================> > > > In the new IR, function-local metadata (say, `LocalValueAsMetadata`) > > *cannot* be used as an operand to metadata -- the only legal place for > > it is in a `MetadataAsValue` instance. This prevents the additional > > complexity from poisoning the rest of the metadata hierarchy. > > > > Effectively, this restricts function-local metadata to direct operands > > of intrinsics. > > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141113/91aa87a8/attachment.html>
Duncan P. N. Exon Smith
2014-Nov-13 20:06 UTC
[LLVMdev] [RFC] Separating Metadata from the Value hierarchy
> >> > llvm::Argument attachments are hard >> > ==================================>> > >> > I've been looking at prototyping metadata attachments to >> > `llvm::Argument`, which is key to replacing debug info intrinsics. >> > >> > It's a fairly big piece of new IR, and comes with its own subtle >> > semantic decisions. What do you do with metadata attached to arguments >> > when you inline a function? If the arguments are remapped to other >> > instructions (or arguments), they may have metadata of the same kind >> > attached. Does one win? Which one? Or are they merged? What if the >> > arguments get remapped to constants? What about when a function is >> > cloned? >> > >> > While the rest of this metadata-is-not-a-value proposal is effectively >> > NFC, this `Argument` part could introduce problems. If I rewrite debug >> > info intrinsics as argument attachments and then immediately split >> > `Metadata` from `Value`, any semantic subtleties will be difficult to >> > diagnose in the noise of the rest of the changes. >> > >> > While I was looking at this as a shortcut to avoid porting >> > function-local metadata, I think it introduces more points of failure >> > than problems it solves. >> >> >> One thing to consider is that there are cases were we describe function arguments without referencing the argument in the intrinsic at all. Currently, if you compile >> >> $ cat struct.c >> struct s { int a; int b; }; >> int foo(struct s s1) { return s1.a; } >> >> $ clang -g -O1 -arch x86_64 -S -emit-llvm struct.c >> >> we cannot preserve the debug info for the argument so it is turned into an intrinsic describing an undef value. >> >> ; Function Attrs: nounwind readnone ssp uwtable >> define i32 @foo(i64 %s1.coerce) #0 { >> entry: >> %s1.sroa.0.0.extract.trunc = trunc i64 %s1.coerce to i32 >> tail call void @llvm.dbg.declare(metadata !18, metadata !14, metadata !19), !dbg !20 >> ret i32 %s1.sroa.0.0.extract.trunc, !dbg !21 >> } >> >> !18 = metadata !{%struct.s* undef} > > I'm a little confused by this example - ideally in this case we wouldn't lose the variable, it's right there and we should reference it. But, yes, that'll take your SROA patch, etc. > > Note that it is critical for this DIVariable to make it into the debug info even if it is undefined, or the function arguments won’t match the function signature. > >> In theory, at -O0 we should never "not have" an argument to attach something to, right? (that'd require DAE to run) and above -O0 we store the variable list so we can't lose variables anyway. >> >> Should we just aim for that? How far off are we before that's a practical goal >> (so we can avoid having to add in an intrinsic for this special case that we plan/hope/know will go away eventuallyThis (new) proposal doesn't add any intrinsics, it just supports the current use cases.>> ). Alternatively, should we just turn on the variable list even at -O0, I know some other bugs that would 'fix' (fix in the sense of at least providing the right signature - but still be cases of missing variable location info).I think discussion about the debug info schema (and intrinsics) and their semantics is orthogonal to separating metadata from the `Value` hierarchy. At least, it's orthogonal now that I've found a design for function-local metadata that restricts it to intrinsic arguments, so it doesn't add complexity to the rest of the IR. IMO, it was a mistake for me to conflate the two in the first place; I was naive about the scope of changes required to replace the intrinsics.