On Friday 11 September 2009 15:20, Jeffrey Yasskin wrote:> I've got a suggestion for a refinement: > > Right now, we have classes like DILocation that wrap an MDNode* to > provide more convenient access to it. It would be more convenient for > users if instead of > > MDNode *DbgInfo = Inst->getMD(MDKind::DbgTag); > Inst2->setMD(MDKind::DbgTag, DbgInfo); > > they could write: > > DILocation DbgInfo = Inst->getMD<DILocation>(); > inst2->setMD(DbgInfo); > > we'd use TheContext->RegisterMDKind<MyKindWrapper>() or > ...Register<MyKindWrapper>("name"); to register a new kind. (I prefer > the first.)Yes, this is very convenient. This along with the rest of Chris' proposal is very similar to the way we handled metadata in a compiler I worked on years ago. It was so useful we even used it to stash dataflow information away as we did analysis. Of course we had metatadat tagged on control structures as well. I'd like to see the currently proposal extended to other constructs as Chris notes.> StaticTypeId is a new class that maps each of its template arguments > to a small, unique integer, which may be different in different > executions.How does this work across compilation units? How about with shared LLVM libraries? These kinds of global unique IDs are notoriously difficult to get right. I'd suggest using a third-party unique-id library. Boost.UUID is one possibility but not the only one. I have a few questions and comments about Chris' initial proposal as well. - I don't like the separation between "built-in" metadata and "extended" metadata. Why not make all metadata use the RegisterMDKind interface and just have the LLVM libraries do it automatically for the "built-in" stuff? Having a separate namespace of enums is going to get confusing. Practically every day I curse the fact that "int" is different than "MyInt" in C++. :-/ - Defaulting alignment to 1 when metatadata is not present is going to be a huge performance hit on many architectures. I hope we can find a better solution. I'm not sure what it is yet because we have to maintain safety. I just fear a Pass inadvertantly dropping metadata and really screwing things up. This looks very promising! -Dave
On Fri, Sep 11, 2009 at 3:23 PM, David Greene <dag at cray.com> wrote:> On Friday 11 September 2009 15:20, Jeffrey Yasskin wrote: >> StaticTypeId is a new class that maps each of its template arguments >> to a small, unique integer, which may be different in different >> executions. > > How does this work across compilation units? How about with shared LLVM > libraries? These kinds of global unique IDs are notoriously difficult > to get right. I'd suggest using a third-party unique-id library. Boost.UUID > is one possibility but not the only one.template<typename T> class StaticTypeId { static int id; } extern int NextStaticTypeId; // Initialized to 0. Possibly an atomic type instead. template<typename T> int StaticTypeId<T>::id = NextStaticTypeId++; This relies on the compiler uniquing static member variables across translation units, and I've never tested that across shared library boundaries. The initializer didn't work with gcc-2 (there was a workaround), but I believe it works with gcc-4. I've never tested it with MSVC. We can also use static local variables, which would have a different set of bugs, but they're very slightly slower to access. Since there's a registration step, we could also use Pass-style IDs, and have the registration fill them in, which would avoid uniquing problems.
On Sep 11, 2009, at 3:23 PM, David Greene wrote:> I have a few questions and comments about Chris' initial proposal as > well. > > - I don't like the separation between "built-in" metadata and > "extended" > metadata. Why not make all metadata use the RegisterMDKind > interface and > just have the LLVM libraries do it automatically for the "built-in" > stuff? > Having a separate namespace of enums is going to get confusing. > Practically > every day I curse the fact that "int" is different than "MyInt" in C > ++. :-/"builtin" metadata would also be registered, the only magic would be that the encoding would be smaller in the IR.> - Defaulting alignment to 1 when metatadata is not present is going > to be a > huge performance hit on many architectures. I hope we can find a > better > solution. I'm not sure what it is yet because we have to maintain > safety. > I just fear a Pass inadvertantly dropping metadata and really > screwing > things up.I don't expect metadata to be commonly stripped. This could be just as bad a perf problem for other things like TBAA or high level type information for a dynamic language. I think it is important that the IR is possible to reason about even in uncommon cases though. -Chris
On Friday 11 September 2009 19:22, Chris Lattner wrote:> > - I don't like the separation between "built-in" metadata and > > "extended" > > metadata. Why not make all metadata use the RegisterMDKind > > interface and > > just have the LLVM libraries do it automatically for the "built-in" > > stuff? > > Having a separate namespace of enums is going to get confusing. > > Practically > > every day I curse the fact that "int" is different than "MyInt" in C > > ++. :-/ > > "builtin" metadata would also be registered, the only magic would be > that the encoding would be smaller in the IR.Except the API is different. Built-in types use a well-known enum value not available to extended metadata. I have no problem with a smaller IR encoding. It's the programming interface I'm concerned about. I'd rather it be the same for everything.> I don't expect metadata to be commonly stripped. This could be just > as bad a perf problem for other things like TBAA or high level type > information for a dynamic language. I think it is important that the > IR is possible to reason about even in uncommon cases though.Sure. Just something we need to be aware of. -Dave
On Friday 11 September 2009 19:15, Jeffrey Yasskin wrote:> This relies on the compiler uniquing static member variables across > translation units, and I've never tested that across shared library > boundaries. The initializer didn't work with gcc-2 (there was a > workaround), but I believe it works with gcc-4. I've never tested it > with MSVC. We can also use static local variables, which would have a > different set of bugs, but they're very slightly slower to access.Shared libraries are the big problem. I know the Boost guys had endless discussions about how to design a Singleton to work in the presence of shared libraries and this is pretty close to the same problem.> Since there's a registration step, we could also use Pass-style IDs, > and have the registration fill them in, which would avoid uniquing > problems.Yes, I think that should work. Doing things with static initializer magic is asking for trouble. -Dave