Hi all,

With C++'s ODR, we are able to unique C++ types by using type identifiers to refer to types. Type identifiers are generated by the C++ mangler. What about languages without ODR? Should we unique C types as well?

One solution for C types is to generate a cross-CU unique identifier for each C type. Before linking, we update all type identifiers in a source module with the hash of the corresponding C type; linking can then continue as usual.

This requires clang to generate a cross-CU unique identifier for C types (one simple scheme is to take an identifier that is unique within the CU and concatenate the CU's file name). It also requires hashing of C types at the DebugInfo IR level. We can add an API such as updateTypeIdentifiers(Module *), which the linker can call right before linking in a source module.

This is a preliminary design to start discussion. Comments and feedback are welcome.

Thanks,
Manman
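The two steps proposed above can be sketched in Python on a toy model. This is only an illustration under stated assumptions, not LLVM or clang code: a "module" is a plain dict from identifier to type description, `update_type_identifiers` is a stand-in for the proposed updateTypeIdentifiers(Module *), and `structural_hash`, `cu_local_identifier`, and `make_module` are hypothetical helper names.

```python
import hashlib


def cu_local_identifier(cu_file, local_id):
    """Front-end step: an id unique within the CU, joined with the CU's file name."""
    return f"{cu_file}#{local_id}"


def structural_hash(type_id, types, in_progress=()):
    """Hash a type by its name and members, following references to other types."""
    if type_id not in types:
        return type_id  # base type such as "int": use its name directly
    if type_id in in_progress:
        return "cycle:" + types[type_id]["name"]  # back edge of a recursive type
    ty = types[type_id]
    parts = [ty["name"]]
    for member_name, member_type in ty["members"]:
        member_hash = structural_hash(member_type, types, in_progress + (type_id,))
        parts.append(f"{member_name}:{member_hash}")
    return hashlib.sha1("|".join(parts).encode()).hexdigest()


def update_type_identifiers(types):
    """Pre-link step: rekey every type, and every reference to it, by its hash."""
    new_id = {tid: structural_hash(tid, types) for tid in types}
    return {
        new_id[tid]: {
            "name": ty["name"],
            "members": [(n, new_id.get(t, t)) for n, t in ty["members"]],
        }
        for tid, ty in types.items()
    }


def make_module(cu_file):
    """Hypothetical module: struct point plus struct line, which refers to point."""
    point = cu_local_identifier(cu_file, 1)
    line = cu_local_identifier(cu_file, 2)
    return {
        point: {"name": "point", "members": [("x", "int"), ("y", "int")]},
        line: {"name": "line", "members": [("start", point), ("end", point)]},
    }


module_a = update_type_identifiers(make_module("a.c"))
module_b = update_type_identifiers(make_module("b.c"))
merged = {**module_a, **module_b}  # toy "linking": equal identifiers collapse
print(len(merged))
```

Before the update, the identifiers from a.c and b.c never match because they embed the CU file name; after the update, structurally identical types share a key and the merge keeps one copy of each.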
Eric Christopher
2013-Oct-11 18:48 UTC
[LLVMdev] [Debug Info + LTO] Type Uniquing for C types?
> With C++'s ODR, we are able to unique C++ types by using type identifiers to
> refer to types.
> Type identifiers are generated by C++ mangler. What about languages without
> ODR? Should we unique C types as well?

We can, but the identifier will likely need to be constructed on a language-dependent basis to ensure uniqueness.

> One solution for C types is to generate a cross-CU unique identifier for C
> types. And before linking, we update all type identifiers in a source module
> with the corresponding hash of the C types, then linking can continue as
> usual.

Yes.

> This requires clang to generate a cross-CU unique identifier for C types
> (one simple scheme is using an identifier that is unique within the CU and
> concatenating the CU's file name). And it also requires hashing of C types
> at DebugInfo IR level. We can add an API such as
> updateTypeIdentifiers(Module *), linker can call it right before linking in
> a source module.

I think the easiest design you'll get for uniquing C types that are named the same thing (i.e. a type defined in a .h file) is to use the name of the struct combined with the file (and possibly line/column) as an identifier. If you want to unify by structure, then you'll need to do something equivalent to the type hashing that we're implementing in the back end, but that will be more difficult to construct via the front end. It may be possible, though.

-eric
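As a rough Python sketch of the name-plus-location identifier suggested here: `location_identifier` and the header paths are hypothetical, and the `name:file:line[:column]` layout is just one possible encoding, not something clang emits.

```python
def location_identifier(name, file, line, column=None):
    """Build an identifier from the struct name and its declaration site."""
    parts = [name, file, str(line)]
    if column is not None:
        parts.append(str(column))
    return ":".join(parts)


# Two CUs that include the same header see the struct at the same location,
# so they produce the same identifier without any coordination.
id_in_a = location_identifier("point", "include/point.h", 3)
id_in_b = location_identifier("point", "include/point.h", 3)

# A different struct that happens to share the name, declared elsewhere,
# gets a different identifier.
id_other = location_identifier("point", "vendor/geom.h", 12)

print(id_in_a == id_in_b, id_in_a == id_other)
```

The appeal of this scheme is that the front end can compute the identifier from information it already has for each declaration, with no hashing pass.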
Eric Christopher
2013-Oct-11 18:54 UTC
[LLVMdev] [Debug Info + LTO] Type Uniquing for C types?
> I think the easiest design you'll get for uniquing C types that are
> named the same thing (i.e. type defined in a .h file) is to use the
> name of the struct combined with the file (and possibly line/column)
> as an identifier. If you want to unify by structure then you'll need
> to do something the equivalent to the type hashing that we're
> implementing in the back end, but that'll be more difficult to
> construct via the front end - it may be possible though.

To sum up in a slightly better way: I think Doug has posted some rules on how to merge C types for modules, and we could use those to construct a unique identifier for the type. If we do that, I'd request that we prepend the type name in there somehow, as that would be convenient. :)

-eric
On Fri, Oct 11, 2013 at 11:48 AM, Eric Christopher <echristo at gmail.com> wrote:

> > With C++'s ODR, we are able to unique C++ types by using type
> > identifiers to refer to types.
> > Type identifiers are generated by C++ mangler. What about languages
> > without ODR? Should we unique C types as well?
>
> We can, but the identifier will need to be constructed on, likely, a
> language dependent basis to ensure uniqueness.
>
> > One solution for C types is to generate a cross-CU unique identifier for
> > C types. And before linking, we update all type identifiers in a source
> > module with the corresponding hash of the C types, then linking can
> > continue as usual.
>
> Yes.
>
> > This requires clang to generate a cross-CU unique identifier for C types
> > (one simple scheme is using an identifier that is unique within the CU and
> > concatenating the CU's file name). And it also requires hashing of C
> > types at DebugInfo IR level. We can add an API such as
> > updateTypeIdentifiers(Module *), linker can call it right before linking
> > in a source module.
>
> I think the easiest design you'll get for uniquing C types that are
> named the same thing (i.e. type defined in a .h file) is to use the
> name of the struct combined with the file (and possibly line/column)
> as an identifier.

Since we don't have ODR, a struct in a .h file may depend on macros that are defined differently in different CUs, which gives us two versions of the struct in two different CUs. So it seems we can't assume that structs with the same name, defined at the same file/line/column, are the same.

> If you want to unify by structure then you'll need
> to do something the equivalent to the type hashing that we're
> implementing in the back end, but that'll be more difficult to
> construct via the front end - it may be possible though.

Hashing the types can happen either at the front end or at the IR level. That is our first design choice :)

I think we should try not to hash the types for non-LTO builds, whether at the front end or at the IR level, since it gives us no benefit given that we are already hashing them at the back end.

One advantage of hashing at the IR level is that we can hash just the MDNodes that affect the type MDNode. At the front end, the AST contains more information and should be harder to hash.

Thanks,
Manman

> -eric
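The macro objection above can be made concrete with a small Python sketch. The header name, the WIDE macro, and both helper functions are hypothetical; the point is only that a location-based identifier collides for the two variants while a hash over the members does not.

```python
import hashlib

# Hypothetical header config.h, with the struct starting at line 10:
#
#   struct config {
#       int id;
#   #ifdef WIDE
#       long extra;
#   #endif
#   };


def location_identifier(name, file, line):
    """Identifier built only from the name and the declaration site."""
    return f"{name}:{file}:{line}"


def content_hash(name, members):
    """Identifier built from the name and the (type, field) pairs actually present."""
    text = name + "{" + ";".join(f"{t} {n}" for t, n in members) + "}"
    return hashlib.sha1(text.encode()).hexdigest()


members_in_a = [("int", "id")]                     # a.c compiled without -DWIDE
members_in_b = [("int", "id"), ("long", "extra")]  # b.c compiled with -DWIDE

same_location = (
    location_identifier("config", "config.h", 10)
    == location_identifier("config", "config.h", 10)
)
same_content = (
    content_hash("config", members_in_a) == content_hash("config", members_in_b)
)
print(same_location, same_content)
```

Uniquing by location alone would merge these two layouts into one type, so a scheme along these lines would need the hash (or an equivalent structural check) to tell them apart.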