I'm having a problem with MCJIT (in LLVM 3.3 and 3.4): it doesn't resolve symbols from a precompiled bitcode module the same way the old JIT did. It's possible that it's just my misunderstanding. Maybe somebody can spot my problem, or identify it as an MCJIT bug.

Here's my situation, in a nutshell:

* I am assembling IR and JITing in my app. The IR may potentially make calls into a large body of code that I precompile to bitcode using "clang++ -S --emit-llvm", then wrap in a .cpp file containing the bitcode as a char array, which is compiled into my app.

* Before JITing the dynamic code, my app initializes the Module like this:

    llvm::MemoryBuffer* buf =
        llvm::MemoryBuffer::getMemBuffer (llvm::StringRef(bitcode, bitcode_size), name);
    llvm::Module *m = llvm::getLazyBitcodeModule (buf, context(), err);

  where bitcode is a big char array holding the precompiled bitcode. The idea is to "seed" the module with that precompiled bitcode so that any calls I inserted into the IR will resolve properly.

* When I JIT, I just refer to functions in the bitcode by name, like "foo", if that's what I called the function in the original .cpp file that was turned into bitcode.

* Traditionally, I have created a JIT execution engine like this:

    m_llvm_exec = llvm::ExecutionEngine::createJIT (module(), err, jitmm(),
                                                    llvm::CodeGenOpt::Default,
                                                    /*AllocateGVsWithCode*/ false);

All of this has worked fine; it's a system that's seen heavy production use for a couple of years now.

Now I'm trying to make this codebase work with MCJIT, and I've run into some trouble. Here's how I'm setting up the ExecutionEngine for the MCJIT case:

    m_llvm_exec = llvm::EngineBuilder(module())
                      .setEngineKind(llvm::EngineKind::JIT)
                      .setErrorStr(err)
                      .setJITMemoryManager(jitmm())
                      .setOptLevel(llvm::CodeGenOpt::Default)
                      .setUseMCJIT(USE_MCJIT)
                      .create();

USE_MCJIT is 1 when I'm building the code to use MCJIT. I'm initializing the buffer and seeding it with the precompiled bitcode in the same way as always, as outlined above.

The basic problem is that it's not finding the symbols in that bitcode. I get an error message back like this:

    Program used external function '_foo' which could not be resolved!

So it seems to be a question of whether the underscore prefix is included when looking up the function in the module, and the old JIT and MCJIT disagree.

Furthermore, if I change the creation of the module from llvm::getLazyBitcodeModule to this:

    llvm::Module *m = llvm::ParseBitcodeFile (buf, context(), err);

it works just fine. But of course, I'd really like to deserialize this bitcode lazily: it contains a ton of functions potentially called by my IR, but any given bit of code that I'm JITing uses only a tiny subset, so the lazy option greatly reduces JIT overhead (10-20x!). That's considered fairly critical for our app.

So, in short:

    old JIT + ParseBitcodeFile     = works
    old JIT + getLazyBitcodeModule = works
    MCJIT   + ParseBitcodeFile     = works
    MCJIT   + getLazyBitcodeModule = BROKEN

Does anybody have advice? Thanks in advance for any help.

--
Larry Gritz
lg at larrygritz.com
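For concreteness, the lookup step Larry describes ("I just refer to functions in the bitcode by name, like 'foo'") would typically look something like the following under the old JIT. This is a minimal sketch against the LLVM 3.3/3.4 C++ API, reusing m and m_llvm_exec from the message above; foo's signature and the error handling are assumptions:

    // Find the function defined by the precompiled bitcode.  With
    // getLazyBitcodeModule, the body is materialized on demand when the
    // old JIT compiles the first call to it.
    llvm::Function *f = m->getFunction("foo");   // plain, unmangled name
    if (f) {
        // Triggers (lazy) codegen and returns a callable address.
        void *addr = m_llvm_exec->getPointerToFunction(f);
        typedef float (*FooFn)(float);           // assumed signature for foo
        FooFn foo = reinterpret_cast<FooFn>(addr);
        float result = foo(1.0f);
        (void)result;
    }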
Hi Larry,

I'm pretty sure MCJIT won't do what you need without some changes to the way you're doing things.

When MCJIT compiles a Module, it compiles the entire Module and tries to resolve any and all undefined symbols. I'm not familiar with getLazyBitcodeModule, but at a glance (and cross-referencing your comments below) it seems to add GlobalValues to a Module as they are needed. MCJIT doesn't let you modify a Module once it has compiled it, so that's not going to work. Even if we built some scheme into MCJIT to materialize things before it compiled a Module, it would end up materializing everything, so that wouldn't help you.

You have a few options.

1. You can continue to load the pre-existing bitcode with getLazyBitcodeModule, then emit your dynamic code into a separate Module which gets linked against the "lazy" Module before it is handed off to MCJIT. (A rough sketch of the linking call follows this message.)

2. You can use MCJIT's object caching mechanism to load a fully pre-compiled version of your bitcode. Again you'd need to have your dynamic code in a separate Module, but in this case MCJIT would take care of the linking. If you know the target architecture ahead of time, you can install the cached object with your application. If not, you'd need to take the large compilation hit once; after that it should be fairly fast. The downside is that you'd potentially have a lot more code loaded into memory than you need. (A sketch of the cache interface also follows below.)

3. You can break the pre-compiled code into smaller chunks and compile them into an archive file. MCJIT recently added the ability to link against archive files. This would give you control over the granularity at which pieces of your pre-compiled code get loaded, while also giving you the speed of the cached-object-file solution. The trade-off is that for this solution you do need to know the target architecture ahead of time.

Hope this helps.

-Andy
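For option 1, the linking step might look roughly like this. This is a minimal sketch against the LLVM 3.3/3.4 API (llvm::Linker::LinkModules, declared in llvm/Linker.h at the time); dynamic_module and lazy_module are placeholder names for the two Modules described above:

    #include "llvm/Linker.h"                // LLVM 3.3/3.4 location
    #include "llvm/Support/raw_ostream.h"

    // dynamic_module holds the freshly assembled IR; lazy_module came from
    // getLazyBitcodeModule.  LinkModules copies the source module into the
    // destination module and returns true on error.
    std::string link_err;
    if (llvm::Linker::LinkModules(dynamic_module, lazy_module,
                                  llvm::Linker::PreserveSource, &link_err)) {
        llvm::errs() << "bitcode link failed: " << link_err << "\n";
    }
    // dynamic_module is then the Module handed to EngineBuilder/MCJIT.

How much of the lazy module the link step ends up materializing determines whether the 10-20x lazy-loading win survives, so that is worth measuring before committing to this route.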
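For option 2, the hook is the llvm::ObjectCache interface (llvm/ExecutionEngine/ObjectCache.h in 3.3/3.4), installed with ExecutionEngine::setObjectCache. Below is a minimal in-memory sketch; the class name and the keep-one-object policy are assumptions, and a real cache would persist the buffer to disk keyed by module so later runs skip compilation entirely:

    #include "llvm/ADT/OwningPtr.h"
    #include "llvm/ExecutionEngine/ObjectCache.h"
    #include "llvm/Support/MemoryBuffer.h"

    class SimpleObjectCache : public llvm::ObjectCache {
    public:
        // Called by MCJIT after it compiles a Module; stash a copy of the
        // finished object image.
        virtual void notifyObjectCompiled(const llvm::Module *M,
                                          const llvm::MemoryBuffer *Obj) {
            m_object.reset(llvm::MemoryBuffer::getMemBufferCopy(
                Obj->getBuffer(), Obj->getBufferIdentifier()));
        }

        // Called by MCJIT before it compiles a Module; returning a buffer
        // (ownership passes to MCJIT) skips compilation entirely.
        virtual llvm::MemoryBuffer *getObject(const llvm::Module *M) {
            if (!m_object)
                return 0;
            return llvm::MemoryBuffer::getMemBufferCopy(
                m_object->getBuffer(), m_object->getBufferIdentifier());
        }

    private:
        llvm::OwningPtr<llvm::MemoryBuffer> m_object;
    };

    // Usage: install before the first compilation.
    //     SimpleObjectCache cache;
    //     m_llvm_exec->setObjectCache(&cache);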
This is sounding rather like getLazyBitcodeModule is simply incompatible with MCJIT. Can anybody confirm that this is definitely the case? Is it by design, by omission, or a bug?

Re your options #1 and #2 -- sorry for the newbie questions, but can you point me to docs or code examples for how the linking or object caching should be achieved? If I do either of these rather than seeding my bitcode into the same module where I'm dynamically assembling my IR, does that mean that it will be unable to inline those functions?

It's possible that my best option is to give up on getLazyBitcodeModule when using MCJIT and revert to ParseBitcodeFile, but to lower the compilation overhead by carefully dividing my module into just the parts that get the most bang for the buck from inlining (keep those in the bitcode, so there's much less to compile), and move the rest into my app, where the symbols can still be resolved but the functions can't be inlined.

-- lg

On Jan 21, 2014, at 9:51 AM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:

> I'm pretty sure MCJIT won't do what you need without some changes to the way you're doing things.

--
Larry Gritz
lg at larrygritz.com
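For the fallback Larry sketches at the end (functions moved into the app so that they resolve but can't be inlined), MCJIT routes unresolved symbols through its memory manager. A hedged sketch, assuming an LLVM 3.3/3.4 SectionMemoryManager-derived manager can stand in for jitmm(); lookup_app_symbol is a hypothetical app-side symbol table, and note that later releases route resolution through getSymbolAddress instead:

    #include <string>
    #include "llvm/ExecutionEngine/SectionMemoryManager.h"

    // Hypothetical: maps names of functions compiled into the app itself
    // to their addresses; returns 0 when the name is unknown.
    void *lookup_app_symbol(const std::string &name);

    class AppSymbolMemoryManager : public llvm::SectionMemoryManager {
    public:
        // MCJIT asks the memory manager for any symbol it cannot find in
        // the Module; returning the app's own function pointer links the
        // call without exposing a body that could be inlined.
        virtual void *getPointerToNamedFunction(const std::string &Name,
                                                bool AbortOnFailure = true) {
            if (void *addr = lookup_app_symbol(Name))
                return addr;
            // Fall back to the default process-wide symbol search.
            return llvm::SectionMemoryManager::getPointerToNamedFunction(
                Name, AbortOnFailure);
        }
    };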