Hi all,

We've run into a tricky situation in our work on the Crack compiler, and I'm hoping that someone on this list can help us find the best solution.

We're currently trying to implement "module caching" for Crack, similar to the feature in Python where compiled module bytecode is persisted; in our case we persist LLVM bitcode at compile time. When we import a module, we first check for a bitcode file matching the source. If one exists, we load it and extract compile-time metadata from it, saving us a compile step.

The problem is that the StructType objects in the cached module are different from those referenced by the compiler, so when we try to reference entities in the cached module we get an assertion failure due to type incompatibility. Specifically, this currently happens when we reference a global variable in the cached module from an array initializer in the importing module, which is still being compiled.

We can't easily just use the linker to manage all of this, because we still want to be able to persist the new module independently for later use. There may be other reasons, too: separate modules are a fundamental assumption of our design. We currently use the linker only at the end of an AOT build.

So from what I can see, our possible solutions are:

1) duplicate the LinkModule internal code and copy the module we load from bitcode to a new module with the correct types mapped.
2) duplicate BitcodeReader and create a version that reuses existing StructTypes.
3) destructively convert all of the types in the imported module to our existing types.

Needless to say, none of these are especially attractive. Is there a better way to do this? Are any of these options clearly better or worse than the others?

Also, when loading named StructTypes, would it be possible for LLVM to reuse an existing type with the same name, assuming the existing type is isomorphic? This seems like it would be a win all around.
=============================================================================
michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
In this book it is spoken of the Sephiroth, and the Paths, of Spirits and
Conjurations; of Gods, Spheres, Planes and many other things which may or
may not exist. It is immaterial whether they exist or not. By doing
certain things certain results follow. - Aleister Crowley
=============================================================================
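
The cache lookup described in the first message can be sketched roughly as follows. This is a minimal illustration in Python, not Crack's actual implementation: the ".bc" suffix, the ".crk" file name used in the test, the helper names, and the use of modification times to decide whether a cached file "matches" the source are all assumptions, since the post does not say how the match is determined.

```python
import os

CACHE_SUFFIX = ".bc"  # hypothetical naming scheme, not Crack's actual layout


def cached_bitcode_path(source_path):
    """Return the path where bitcode for source_path would be cached."""
    return source_path + CACHE_SUFFIX


def usable_cache(source_path):
    """Return the cache path if it exists and is not older than the source.

    Returns None when the module has to be compiled from source.
    Assumption: freshness is judged by modification time only.
    """
    cache_path = cached_bitcode_path(source_path)
    if not os.path.exists(cache_path):
        return None
    if os.path.getmtime(cache_path) < os.path.getmtime(source_path):
        return None  # the source was edited after the cache was written
    return cache_path
```

An importer would call usable_cache() first and fall back to a normal compile when it returns None. The type mismatch discussed in this thread only arises on the path where a cached file is actually loaded.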
Hi Michael,

since none of the experts has answered, let me share our experiences. We recently had exactly the same problem; I posted about it on this list on January 31st. I didn't follow Duncan's advice to "just use the linker", since for several reasons we wanted to have unique struct types even in the separate modules.

> 1) duplicate the LinkModule internal code and copy the module we load from
> bitcode to a new module with the correct types mapped.

Sounds quite inefficient. If you have to duplicate the internal code anyway, you can also iterate over the original module and call mutateType().

> 2) duplicate BitcodeReader and create a version that reuses existing
> StructTypes.

See below.

> 3) destructively convert all of the types in the imported module to our
> existing types.

That's what we actually implemented, following the idea I described in that post. We don't identify identical struct types by their name, since even in the new type system, names don't actually mean anything. You could just strip them off. Instead, we use the pointer value of the types to identify them, since originally all our modules reside in the same LLVMContext. Since that doesn't seem to be the case in your situation, you would probably have to use the name, or attach other metadata to uniquely identify your structs.

> Also, when loading named StructTypes, would it be possible for LLVM to reuse
> an existing type with the same name assuming the existing type is isomorphic?
> This seems like it would be a win all around.

Just out of interest, I also implemented that, because I thought it could improve overall performance. But I couldn't measure any performance impact on the simple tests in the test-suite.

The main problem is that a named struct can reference other types defined later in the module's type table, so you can only check whether named structs are identical after the whole type table has been parsed and the types have already been created. So during parsing I remember which StructTypes had to be renamed, and after that, but before parsing the instructions, I check which of them are isomorphic to the corresponding existing struct and directly manipulate the type list used when parsing the rest of the module.

A funny insight from implementing that: I also had to change the behaviour of the linker, since it again created copies of all types used in the source module. So after linking, there were again different instances of the same struct type, but only one of them had a meaningful name, since the new copy that the linker creates steals the name of the original type ;)

For both implementations I can provide source code if you wish.

Cheers,
Clemens
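
The deferred check Clemens describes can be modelled with a toy type table. The sketch below is Python with made-up classes (StructType here is a stand-in, not the LLVM class, and load_type_table is a hypothetical helper); it only shows why the comparison has to wait until every entry has a body, and how the type list is patched afterwards. Pointers, function types and LLVM's renaming of colliding names are left out.

```python
class StructType:
    """Toy stand-in for a named struct type; not the LLVM class."""

    def __init__(self, name, fields=None):
        self.name = name
        # Each field is a primitive name such as "i32" or another StructType.
        self.fields = fields


def isomorphic(a, b, assumed=None):
    """Structural comparison that tolerates cycles between struct types."""
    if assumed is None:
        assumed = set()
    if a is b:
        return True
    if isinstance(a, str) or isinstance(b, str):
        return a == b
    pair = (id(a), id(b))
    if pair in assumed:
        return True  # already being compared further up the call chain
    assumed.add(pair)
    if a.fields is None or b.fields is None:
        return False
    if len(a.fields) != len(b.fields):
        return False
    return all(isomorphic(x, y, assumed) for x, y in zip(a.fields, b.fields))


def load_type_table(table, context):
    """table: list of (name, fields); a field is a primitive name or an
    index into the table, possibly pointing at a later entry.
    context: dict mapping names to the StructTypes that already exist."""
    # Pass 1: create an empty shell per entry so forward references resolve.
    loaded = [StructType(name) for name, _ in table]
    # Pass 2: fill in the bodies.
    for ty, (_, specs) in zip(loaded, table):
        ty.fields = [loaded[s] if isinstance(s, int) else s for s in specs]
    # Pass 3: only now can each loaded struct be compared with its namesake.
    replacement = {}
    for ty in loaded:
        existing = context.get(ty.name)
        if existing is not None and isomorphic(ty, existing):
            replacement[id(ty)] = existing
    # Patch the structs we keep so their fields point at the reused types.
    for ty in loaded:
        if id(ty) not in replacement:
            ty.fields = [replacement.get(id(f), f) for f in ty.fields]
    return [replacement.get(id(ty), ty) for ty in loaded]
```

The list returned by load_type_table() plays the role of the type list that is consulted while the rest of the module is parsed: entries that matched an existing struct now refer to that struct, and everything else keeps its freshly created type.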
Hi Clemens, thanks for your response.

Clemens Hammacher wrote:
> Hi Michael,
>
> since none of the experts has answered, let me share our experiences. We
> recently had exactly the same problem; I posted about it on this list on
> January 31st.
> I didn't follow Duncan's advice to "just use the linker", since for
> several reasons we wanted to have unique struct types even in the
> separate modules.
>
> > 1) duplicate the LinkModule internal code and copy the module we load from
> > bitcode to a new module with the correct types mapped.
>
> Sounds quite inefficient. If you have to duplicate the internal code
> anyway, you can also iterate over the original module and call
> mutateType().
>
> > 2) duplicate BitcodeReader and create a version that reuses existing
> > StructTypes.
>
> See below.
>
> > 3) destructively convert all of the types in the imported module to our
> > existing types.
>
> That's what we actually implemented, following the idea I described in
> that post. We don't identify identical struct types by their
> name, since even in the new type system, names don't actually mean
> anything. You could just strip them off.
> Instead, we use the pointer value of the types to identify them, since
> originally all our modules reside in the same LLVMContext. Since that
> doesn't seem to be the case in your situation, you would probably have
> to use the name, or attach other metadata to uniquely identify your
> structs.

I was actually fearful of this approach: it looked to me like the linker was at least partially copying data structures to the destination module. I see that there is a mutateType() method in Value, though it comes with a very stern warning :-) But given your success with it, and given that it seems to involve the least copy-pasting of existing code, I think I'll give it a try.

> > Also, when loading named StructTypes, would it be possible for LLVM to reuse
> > an existing type with the same name assuming the existing type is isomorphic?
> > This seems like it would be a win all around.
>
> Just out of interest, I also implemented that, because I thought it
> could improve overall performance. But I couldn't measure any
> performance impact on the simple tests in the test-suite.
> The main problem is that a named struct can reference other types
> defined later in the module's type table, so you can only check
> whether named structs are identical after the whole type table has been
> parsed and the types have already been created. So during parsing I
> remember which StructTypes had to be renamed, and after that, but
> before parsing the instructions, I check which of them are isomorphic
> to the corresponding existing struct and directly manipulate the type
> list used when parsing the rest of the module.

Ah, I see. That would definitely complicate things.

> A funny insight from implementing that: I also had to change the
> behaviour of the linker, since it again created copies of all types used
> in the source module. So after linking, there were again different
> instances of the same struct type, but only one of them had a meaningful
> name, since the new copy that the linker creates steals the name of the
> original type ;)
>
> For both implementations I can provide source code if you wish.

Thanks, I think we should be OK given your explanation. If I get stuck, I might take you up on it. Although if we're both doing this, it may be worthwhile for us to try to come up with something general enough to include in LLVM.

> Cheers,
> Clemens

=============================================================================
michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
The world is full of security systems. Hack one of them. - Bruce Schneier
=============================================================================
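
Option 3, the destructive conversion that Clemens implemented and Michael plans to try, can be pictured with another small Python model. NamedType, Value and retarget_types are hypothetical names for this illustration only. Matching by name follows Clemens's suggestion for modules that do not share an LLVMContext, and the assignment to value.type merely stands in for what Value::mutateType() would do in LLVM. A real pass would also have to rebuild derived types (pointers, arrays, function types) that contain the replaced structs, which this sketch ignores.

```python
class NamedType:
    """Toy named struct type; identity (not the name) is what the checks see."""

    def __init__(self, name):
        self.name = name


class Value:
    """Toy value carrying a reference to its type."""

    def __init__(self, name, ty):
        self.name = name
        self.type = ty


def retarget_types(values, canonical_by_name):
    """Point every value whose type has a canonical namesake at that type.

    Returns the number of values that were changed.
    """
    retargeted = 0
    for value in values:
        canonical = canonical_by_name.get(value.type.name)
        if canonical is not None and canonical is not value.type:
            value.type = canonical  # plays the role of mutateType() here
            retargeted += 1
    return retargeted
```

After the pass, values from the cached module and values created by the compiler share the same type objects, which is the property the failing assertion was checking for.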