thr3ads.net - llvm dev - [LLVMdev] mapping types from a bitcode module [Feb 2012]

Michael Muller

2012-Feb-24 12:52 UTC

[LLVMdev] mapping types from a bitcode module

Hi all,

We've run into a tricky situation in our work on the Crack compiler and I'm
hoping that someone on this list can help us find the best solution.

We're currently trying to implement "module caching" for Crack, similar to
Python's caching of compiled bytecode: module bitcode is persisted at compile
time.  When we import a module, before compiling the module we check for a
bitcode file matching the source -- if a bitcode file exists, we load it and
extract compile-time metadata from it, saving us a compile step.
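
The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Crack's actual code: the ".bc" naming rule, the modification-time comparison used to decide whether a cache file "matches" its source, and the names cachePathFor and findUsableCache are all hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical naming rule: the cache file sits next to the source file
// and differs only in its extension (".bc").
fs::path cachePathFor(const fs::path &source) {
    assert(!source.empty() && "source path must not be empty");
    fs::path cached = source;
    cached.replace_extension(".bc");
    return cached;
}

// Returns the cache path if it exists and is at least as new as the source,
// otherwise std::nullopt. "At least as new" is an assumed freshness rule.
std::optional<fs::path> findUsableCache(const fs::path &source) {
    const fs::path cached = cachePathFor(source);
    std::error_code ec;
    const auto sourceTime = fs::last_write_time(source, ec);
    if (ec) return std::nullopt;  // source missing or unreadable
    const auto cachedTime = fs::last_write_time(cached, ec);
    if (ec) return std::nullopt;  // also covers a missing cache file
    if (cachedTime < sourceTime) return std::nullopt;  // stale cache
    return cached;
}
```

On a hit the importer would load the bitcode file and read its metadata; on a miss it would compile the source and write a fresh cache file.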

The problem here is that the StructType objects in the cached module are
different from those referenced by the compiler, so when we try to reference
entities in the cached module we get an assertion failure due to type
incompatibility.  Specifically, this is currently happening when we reference
a global variable in the cached module from an array initializer in the
importing module which is still being compiled.

We can't easily just use the linker to manage all of this because we still
want to be able to persist that new module independently for later use.  There
may be other reasons, too: separate modules are a fundamental assumption of our
design.  We currently use the linker only at the end of an AOT build.

So from what I can see, our possible solutions are:

1) duplicate the LinkModule internal code and copy the module we load from
bitcode to a new module with the correct types mapped.
2) duplicate BitcodeReader and create a version that reuses existing
StructTypes.
3) destructively convert all of the types in the imported module to our
existing types.

Needless to say, none of these are especially attractive.  Is there a better
way to do this?  Are any of these options clearly better or worse than the
others?

Also, when loading named StructTypes, would it be possible for LLVM to reuse
an existing type with the same name assuming the existing type is isomorphic?
This seems like it would be a win all around.

=============================================================================
michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
In this book it is spoken of the Sephiroth, and the Paths, of Spirits and
Conjurations; of Gods, Spheres, Planes and many other things which may or
may not exist.  It is immaterial whether they exist or not.  By doing
certain things certain results follow. - Aleister Crowley
=============================================================================

Clemens Hammacher

2012-Feb-27 17:44 UTC

Hi Michael,

since none of the experts answered, let me share our experiences. We
recently had exactly the same problem, I posted on this list on January
31st.
I didn't follow Duncan's advice to "just use the linker", since for
several reasons we wanted to have unique struct types even in the
separate modules.

> 1) duplicate the LinkModule internal code and copy the module we load from
> bitcode to a new module with the correct types mapped.

Sounds quite inefficient. If you have to duplicate the internal code
anyway, you can also iterate over the original module and mutateType().

> 2) duplicate BitcodeReader and create a version that reuses existing
> StructTypes.

See below.

> 3) destructively convert all of the types in the imported module to our
> existing types.

That's what we actually implemented, following the idea I described in
the mentioned post. We don't identify identical struct types by their
name, since even in the new type system, names don't actually mean
anything. You could just strip them off.
Instead, we use the pointer value of the types to identify them, since
originally, all our modules reside in the same LLVMContext. Since that
doesn't seem to be the case in your situation, you probably would have
to use the name, or attach other metadata to uniquely identify your structs.
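
To make option 3 concrete, here is a minimal sketch of a name-keyed, in-place remapping. It deliberately uses toy stand-ins (ToyStructType, ToyValue, ToyModule and remapTypesByName are all hypothetical) rather than the LLVM classes, so it shows only the shape of the idea: types are compared by address, and every value whose type has a same-named counterpart is retargeted to that counterpart. The layout comparison on a flat list of field kinds is an added safety assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a named struct type; NOT llvm::StructType.
struct ToyStructType {
    std::string name;
    std::vector<std::string> fieldKinds;  // placeholder for real element types
};

// Toy stand-in for a value; its type is identified by pointer, not by name.
struct ToyValue {
    std::string name;
    const ToyStructType *type;
};

// A module owns its own copies of the types it was loaded with.
struct ToyModule {
    std::vector<std::unique_ptr<ToyStructType>> types;
    std::vector<ToyValue> values;
};

// Retargets every value in `imported` whose type has a same-named entry in
// `canonical` with an identical field list. Returns how many values changed.
std::size_t remapTypesByName(
    ToyModule &imported,
    const std::unordered_map<std::string, const ToyStructType *> &canonical) {
    std::size_t changed = 0;
    for (ToyValue &v : imported.values) {
        auto it = canonical.find(v.type->name);
        if (it == canonical.end() || it->second == v.type) continue;
        // Assumed safety check: never retarget to a type with another layout.
        if (it->second->fieldKinds != v.type->fieldKinds) continue;
        v.type = it->second;  // the destructive, in-place step
        ++changed;
    }
    return changed;
}
```

A real implementation would also have to visit instructions, constants and function signatures, and keying on names only works if the names are kept stable across modules.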

> Also, when loading named StructTypes, would it be possible for LLVM to reuse
> an existing type with the same name assuming the existing type is isomorphic?
> This seems like it would be a win all around.

Just out of interest, I also implemented that, because I thought it 
could improve the overall performance. But I couldn't measure any 
performance impact on the simple tests in the test-suite.
The main problem is that the named struct could reference other types 
defined later in the type table of the module, so you can only check 
whether named structs are identical after the whole type table has been 
parsed and the types are already created. So during parsing I am 
remembering which StructTypes had to be renamed, and after that - but 
before parsing the instructions - I check which of them are isomorphic 
to the corresponding existing struct, and directly manipulate the type 
list used when parsing the rest of the module.
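
The isomorphism check itself can be illustrated on a toy type graph. The sketch below is not the BitcodeReader change described above; ToyType, isomorphic and isomorphicImpl are hypothetical names, and struct names are ignored entirely. It shows why the check has to wait until the whole type table exists: a struct may refer to itself or to later types, so the comparison walks both graphs in lockstep and treats a pair it is already comparing as equal in order to terminate on cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Toy type node; NOT llvm::Type. `elements` holds the pointee for a
// Pointer and the field types for a Struct.
struct ToyType {
    enum Kind { Int32, Pointer, Struct } kind;
    std::vector<const ToyType *> elements;
};

// Pairs currently assumed equal, keyed by the left-hand type.
using AssumedPairs = std::map<const ToyType *, std::set<const ToyType *>>;

bool isomorphicImpl(const ToyType *a, const ToyType *b, AssumedPairs &assumed) {
    if (a == b) return true;
    if (a->kind != b->kind || a->elements.size() != b->elements.size())
        return false;
    // Seen this pair before: assume equal so recursive types terminate.
    // Any real mismatch elsewhere still makes the overall result false.
    if (!assumed[a].insert(b).second) return true;
    for (std::size_t i = 0; i < a->elements.size(); ++i)
        if (!isomorphicImpl(a->elements[i], b->elements[i], assumed))
            return false;
    return true;
}

// True if the two fully constructed type graphs have the same shape.
bool isomorphic(const ToyType *a, const ToyType *b) {
    AssumedPairs assumed;
    return isomorphicImpl(a, b, assumed);
}
```

Two separately built copies of a linked-list node (an i32 plus a pointer to the node itself) compare as isomorphic, while a copy with the two fields swapped does not.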

A funny insight when implementing that is that I also had to change the 
behaviour of the linker, since it again created copies of all types used 
in the source module. So after linking, there again were different 
instances of the same struct type, but only one of them had a meaningful 
name, since the new copy that the linker creates steals the name of the 
original type ;)

For both implementations I can provide source code if you wish.

Cheers,
Clemens

Michael Muller

2012-Feb-27 19:27 UTC

Hi Clemens - thanks for your response.

Clemens Hammacher wrote:
> Hi Michael,
> 
> since none of the experts answered, let me share our experiences. We 
> recently had exactly the same problem, I posted on this list on January 
> 31st.
> I didn't follow Duncan's advice to "just use the linker", since for
> several reasons we wanted to have unique struct types even in the 
> separate modules.
> 
> > 1) duplicate the LinkModule internal code and copy the module we load from
> > bitcode to a new module with the correct types mapped.
> 
> Sounds quite inefficient. If you have to duplicate the internal code 
> anyway, you can also iterate over the original module and mutateType().
> 
> > 2) duplicate BitcodeReader and create a version that reuses existing
> > StructTypes.
> 
> see below.
> 
> > 3) destructively convert all of the types in the imported module to our
> > existing types.
> 
> That's what we actually implemented, following the idea I described in 
> the mentioned post. We don't identify identical struct types by their 
> name, since even in the new type system, names don't actually mean 
> anything. You could just strip them off.
> Instead, we use the pointer value of the types to identify them, since 
> originally, all our modules reside in the same LLVMContext. Since that 
> doesn't seem to be the case in your situation, you probably would have 
> to use the name, or attach other metadata to uniquely identify your structs.

I was actually fearful of this approach; it looked to me like the linker was
at least partially copying data structures to the destination module.  I see
that there is a mutateType() method in Value, though it comes with a very
stern warning :-)

But given your success with it, and given that it seems to involve the least
amount of copy-pasting the existing code, I think I'll give it a try.
> 
> 
> > Also, when loading named StructTypes, would it be possible for LLVM to reuse
> > an existing type with the same name assuming the existing type is isomorphic?
> > This seems like it would be a win all around.
> 
> Just out of interest, I also implemented that, because I thought it 
> could improve the overall performance. But I couldn't measure any 
> performance impact on the simple tests in the test-suite.
> The main problem is that the named struct could reference other types 
> defined later in the type table of the module, so you can only check 
> whether named structs are identical after the whole type table has been 
> parsed and the types are already created. So during parsing I am 
> remembering which StructTypes had to be renamed, and after that - but 
> before parsing the instructions - I check which of them are isomorphic 
> to the corresponding existing struct, and directly manipulate the type 
> list used when parsing the rest of the module.
Ah, I see.  That would definitely complicate things.
> 
> A funny insight when implementing that is that I also had to change the 
> behaviour of the linker, since it again created copies of all types used 
> in the source module. So after linking, there again were different 
> instances of the same struct type, but only one of them had a meaningful 
> name, since the new copy that the linker creates steals the name of the 
> original type ;)
> 
> For both implementations I can provide source code if you wish.
Thanks, I think we should be ok given your explanation.  If I get stuck,
I might take you up on it.  Although if we're both doing this, it may be
worthwhile for us to try to come up with something general enough to include
in LLVM.
> 
> Cheers,
> Clemens
> 
> 

=============================================================================
michaelMuller = mmuller at enduden.com | http://www.mindhog.net/~mmuller
-----------------------------------------------------------------------------
The world is full of security systems.  Hack one of them. - Bruce Schneier
=============================================================================
