Devang's work on debug info prompted this, thoughts welcome: http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt -Chris
On Fri, Sep 11, 2009 at 11:57 AM, Chris Lattner <clattner at apple.com> wrote:> The demand for metadata is even greater in non-C languages. For high level > scripting languages like python, it would be nice to be able to do static type > inference and annotate various pointers with type class information ("this is > a python string object", "this is a dictionary", whatever).So this particular metadata would be an extension of the type? And get propagated through as you create instructions that depend on other instructions which had the type metadata attached? I found myself wishing for such a thing several times recently. I ended up using a "type tag" of type [0 x opaque*] in my structs to force the type system to differentiate them from each other and make unique classes out of them. It works, but I need a separate hashtable to get back to a "class description" from a Value of a given Type.
On Sep 11, 2009, at 11:08 AM, Kenneth Uildriks wrote:> On Fri, Sep 11, 2009 at 11:57 AM, Chris Lattner <clattner at apple.com> > wrote: >> The demand for metadata is even greater in non-C languages. For >> high level >> scripting languages like python, it would be nice to be able to do >> static type >> inference and annotate various pointers with type class information >> ("this is >> a python string object", "this is a dictionary", whatever). > > So this particular metadata would be an extension of the type? And > get propagated through as you create instructions that depend on other > instructions which had the type metadata attached?No, this would be a property of the operation. In a dynamically typed language like python (and many others) a naive translation will turn all python objects into "void*"s. However, with some static or dynamic analysis, many types can be guessed at or inferred. This is a property of various operations, not about "void*". -Chris
I've got a suggestion for a refinement: Right now, we have classes like DILocation that wrap an MDNode* to provide more convenient access to it. It would be more convenient for users if instead of MDNode *DbgInfo = Inst->getMD(MDKind::DbgTag); Inst2->setMD(MDKind::DbgTag, DbgInfo); they could write: DILocation DbgInfo = Inst->getMD<DILocation>(); inst2->setMD(DbgInfo); we'd use TheContext->RegisterMDKind<MyKindWrapper>() or ...Register<MyKindWrapper>("name"); to register a new kind. (I prefer the first.) These kind wrappers need a couple methods to make them work: const StringRef KindWrapper::name("..."); KindWrapper(MDNode*); // Except for special cases like LdStAlign. KindWrapper::operator bool() {return mdnode!=NULL;} // ?? int StaticTypeId<KindWrapper>::value; // Used for the proposal's MDKind KindWrapper::ValidOnValue(const Value*); MDNode* KindWrapper::merge(MDNode*, MDNode*) // For the optimizers StaticTypeId is a new class that maps each of its template arguments to a small, unique integer, which may be different in different executions. Since the optimizers may want more methods over time, but we don't really want to require users to extend their wrappers, we should say that all wrappers must inherit from a particular type. I'd name this type "MDKind" and rename the proposed MDKind to MDKindID. Then we can add defaults to MDKind over time. Nothing needs to be virtual since these types are all used as template arguments. We could either use a global list of IDs for the MDKinds or have separate lists for each Context. StaticTypeId can only provide a global list, so giving each Context its own list would take an extra lookup, and wouldn't provide any benefit I can see. Chris mentioned that .bc files would store the mapping from name->ID, so the fact that StaticTypeId changes its values between runs isn't a problem. Thoughts? On Fri, Sep 11, 2009 at 9:57 AM, Chris Lattner <clattner at apple.com> wrote:> Devang's work on debug info prompted this, thoughts welcome: > http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
On Sep 11, 2009, at 9:57 AM, Chris Lattner wrote:> Devang's work on debug info prompted this, thoughts welcome: > http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txtThe document mentions "instructions" a lot. We'll want to be able to apply metadata to ConstantExprs as well at least, if not also Arguments (think noalias) and other stuff, so it seems best to just talk about "values" instead, and DenseMap<Value *, ...> instead of DenseMap<Instruction *, ...>. Dan
On Friday 11 September 2009 15:20, Jeffrey Yasskin wrote:> I've got a suggestion for a refinement: > > Right now, we have classes like DILocation that wrap an MDNode* to > provide more convenient access to it. It would be more convenient for > users if instead of > > MDNode *DbgInfo = Inst->getMD(MDKind::DbgTag); > Inst2->setMD(MDKind::DbgTag, DbgInfo); > > they could write: > > DILocation DbgInfo = Inst->getMD<DILocation>(); > inst2->setMD(DbgInfo); > > we'd use TheContext->RegisterMDKind<MyKindWrapper>() or > ...Register<MyKindWrapper>("name"); to register a new kind. (I prefer > the first.)Yes, this is very convenient. This along with the rest of Chris' proposal is very similar to the way we handled metadata in a compiler I worked on years ago. It was so useful we even used it to stash dataflow information away as we did analysis. Of course we had metatadat tagged on control structures as well. I'd like to see the currently proposal extended to other constructs as Chris notes.> StaticTypeId is a new class that maps each of its template arguments > to a small, unique integer, which may be different in different > executions.How does this work across compilation units? How about with shared LLVM libraries? These kinds of global unique IDs are notoriously difficult to get right. I'd suggest using a third-party unique-id library. Boost.UUID is one possibility but not the only one. I have a few questions and comments about Chris' initial proposal as well. - I don't like the separation between "built-in" metadata and "extended" metadata. Why not make all metadata use the RegisterMDKind interface and just have the LLVM libraries do it automatically for the "built-in" stuff? Having a separate namespace of enums is going to get confusing. Practically every day I curse the fact that "int" is different than "MyInt" in C++. :-/ - Defaulting alignment to 1 when metatadata is not present is going to be a huge performance hit on many architectures. I hope we can find a better solution. I'm not sure what it is yet because we have to maintain safety. I just fear a Pass inadvertantly dropping metadata and really screwing things up. This looks very promising! -Dave
On Sep 11, 2009, at 2:11 PM, Dan Gohman wrote:> > On Sep 11, 2009, at 9:57 AM, Chris Lattner wrote: > > >> Devang's work on debug info prompted this, thoughts welcome: >> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt > > The document mentions "instructions" a lot. We'll want to be able to > apply metadata to ConstantExprs as well at least, if not also > Arguments > (think noalias) and other stuff, so it seems best to just talk about > "values" instead, and DenseMap<Value *, ...> instead of > DenseMap<Instruction *, ...>.I wrote: "Note that this document talks about metadata for instructions, it might make sense to generalize this to being metadata for all non-uniqued values (global variables, functions, basic blocks, arguments), but I'm just keeping it simple for now." However, constant exprs are uniqued. What would you find it useful for? -Chris
Dan Gohman wrote:> On Sep 11, 2009, at 9:57 AM, Chris Lattner wrote: > > >> Devang's work on debug info prompted this, thoughts welcome: >> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt > > The document mentions "instructions" a lot. We'll want to be able to > apply metadata to ConstantExprs as well at least, if not also Arguments > (think noalias) and other stuff, so it seems best to just talk about > "values" instead, and DenseMap<Value *, ...> instead of > DenseMap<Instruction *, ...>.I'm wondering that too. Can we replace LLVM function attributes with metadata? There's been some pushback to adding new function attributes in the past and it would be nice to be able to prototype new ones without having to change all of the vm core. Nick
On Fri, Sep 11, 2009 at 1:20 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:> I've got a suggestion for a refinement: > > Right now, we have classes like DILocation that wrap an MDNode* to > provide more convenient access to it. It would be more convenient for > users if instead of > > MDNode *DbgInfo = Inst->getMD(MDKind::DbgTag); > Inst2->setMD(MDKind::DbgTag, DbgInfo); > > they could write: > > DILocation DbgInfo = Inst->getMD<DILocation>(); > inst2->setMD(DbgInfo);IMO, there is not any need to add two llvm::Instruction methods, getMD and setMD. The metadata associated with an instruction will be store separately anyway. - Devang> > we'd use TheContext->RegisterMDKind<MyKindWrapper>() or > ...Register<MyKindWrapper>("name"); to register a new kind. (I prefer > the first.) > > These kind wrappers need a couple methods to make them work: > > const StringRef KindWrapper::name("..."); > KindWrapper(MDNode*); // Except for special cases like LdStAlign. > KindWrapper::operator bool() {return mdnode!=NULL;} // ?? > int StaticTypeId<KindWrapper>::value; // Used for the proposal's MDKind > KindWrapper::ValidOnValue(const Value*); > MDNode* KindWrapper::merge(MDNode*, MDNode*) // For the optimizers > > StaticTypeId is a new class that maps each of its template arguments > to a small, unique integer, which may be different in different > executions. > > Since the optimizers may want more methods over time, but we don't > really want to require users to extend their wrappers, we should say > that all wrappers must inherit from a particular type. I'd name this > type "MDKind" and rename the proposed MDKind to MDKindID. Then we can > add defaults to MDKind over time. Nothing needs to be virtual since > these types are all used as template arguments. > > We could either use a global list of IDs for the MDKinds or have > separate lists for each Context. StaticTypeId can only provide a > global list, so giving each Context its own list would take an extra > lookup, and wouldn't provide any benefit I can see. > > Chris mentioned that .bc files would store the mapping from name->ID, > so the fact that StaticTypeId changes its values between runs isn't a > problem. > > Thoughts? > > On Fri, Sep 11, 2009 at 9:57 AM, Chris Lattner <clattner at apple.com> wrote: >> Devang's work on debug info prompted this, thoughts welcome: >> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt >> >> -Chris >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- - Devang
On Fri, Sep 11, 2009 at 1:20 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:> I've got a suggestion for a refinement: > > Right now, we have classes like DILocation that wrap an MDNode* to > provide more convenient access to it. It would be more convenient for > users if instead of > > MDNode *DbgInfo = Inst->getMD(MDKind::DbgTag); > Inst2->setMD(MDKind::DbgTag, DbgInfo); > > they could write: > > DILocation DbgInfo = Inst->getMD<DILocation>(); > inst2->setMD(DbgInfo); > > we'd use TheContext->RegisterMDKind<MyKindWrapper>() or > ...Register<MyKindWrapper>("name"); to register a new kind. (I prefer > the first.)Right now, I am preparing a very simple implementation that allows us to make progress on debug info front. And the same time, it'd be possible for someone to extend it for other uses. - Devang
On Fri, Sep 11, 2009 at 9:57 AM, Chris Lattner <clattner at apple.com> wrote:> Devang's work on debug info prompted this, thoughts welcome: > http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txtHere is partial initial implementation. Thoughts ? - Devang -------------- next part -------------- A non-text attachment was scrubbed... Name: mdn.diff Type: application/octet-stream Size: 7860 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090915/3b4fb271/attachment.obj>
On Tue, Sep 15, 2009 at 11:37 PM, Devang Patel <devang.patel at gmail.com> wrote:> On Fri, Sep 11, 2009 at 9:57 AM, Chris Lattner <clattner at apple.com> wrote: >> Devang's work on debug info prompted this, thoughts welcome: >> http://nondot.org/sabre/LLVMNotes/ExtensibleMetadata.txt > > Here is partial initial implementation. > Thoughts ?setHasMetadata probably shouldn't be public, like there's no way to directly set HasValueHandle. I'd make the MDKindId type (or typedef) now so it's obvious what those "unsigned"s mean. Should the metadata name->id map be per-context or global? I think it's unlikely people will be registering these things during compilation, so we can afford the overhead of a lock, and making it global helps with the wrapper types (avoids one vector lookup). But I can also change it when I add the wrapper types. It's too bad we don't have a SmallMap class to give MDInfoTy a good API and make it efficient at both small and large map sizes. Maybe s/MDInfoTy/MDMapTy/ to point out that it's a map from MDKindIds to MDNodes? Comment whether RegisterMDKind requires that its argument is as-yet-unknown in Metadata.h. Context.pImpl->TheMetadata.ValueIsDeleted(this); in ~Value will always call the empty ValueIsDeleted(Value*) overload. Please write a unittest for this. Metadata::getMDKind(Name) can just be "return MDHandlerNames.lookup(Name)" Could you comment what the WeakVH in MDPairTy is following? I got confused and thought it was tracking the Instruction holding the metadata rather than the MDNode. You did say this was partial; sorry if I pointed out anything you were already planning to do. Jeffrey
On Sep 15, 2009, at 11:37 PM, Devang Patel wrote:> Here is partial initial implementation.+ /// setMD - Atttach the metadata of given kind with an Instruction. + void setMD(unsigned MDKind, MDNode *Node, Instruction *Inst); Spelling: Attach. +/// setMD - Atttach the metadata of given kind with an Instruction. +void Metadata::setMD(unsigned MDKind, MDNode *Node, Instruction *Inst) { Spelling: Attach.