koffie drinker via llvm-dev
2016-Jul-07 09:52 UTC
[llvm-dev] ObjectCache and getFunctionAddress issue
Hi all,

I'm trying to add a pre-compiled object cache to my run-time. I've implemented the object cache as follows:

    #include "llvm/ExecutionEngine/ObjectCache.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/MemoryBuffer.h"
    #include <memory>
    #include <string>
    #include <unordered_map>

    class EngineObjectCache : public llvm::ObjectCache {
    private:
      // Compiled objects, keyed by module identifier.
      std::unordered_map<std::string, std::unique_ptr<llvm::MemoryBuffer>> CachedObjs;

    public:
      // Called after a module is compiled: keep a copy of the object.
      virtual void notifyObjectCompiled(const llvm::Module *M,
                                        llvm::MemoryBufferRef Obj) {
        auto id = M->getModuleIdentifier();
        auto iter = CachedObjs.find(id);
        if (iter == CachedObjs.end()) {
          auto buf = llvm::MemoryBuffer::getMemBufferCopy(Obj.getBuffer(),
                                                          Obj.getBufferIdentifier());
          CachedObjs.insert(std::make_pair(id, std::move(buf)));
        }
      }

      // Called before compiling a module: return a copy of the cached
      // object, or nullptr to make the JIT compile the module.
      virtual std::unique_ptr<llvm::MemoryBuffer> getObject(const llvm::Module *M) {
        auto id = M->getModuleIdentifier();
        auto iter = CachedObjs.find(id);
        if (iter != CachedObjs.end()) {
          llvm::MemoryBuffer &B = *iter->second;
          return llvm::MemoryBuffer::getMemBufferCopy(B.getBuffer(),
                                                      B.getBufferIdentifier());
        }
        return nullptr;
      }
    };

When I generate the code for the first time, everything works fine. The objects in CachedObjs are dumped to disk and reloaded for the next run. However, on the next run (thus using the cached objects), executionengine->getFunctionAddress() *sometimes* returns nullptr for a function that exists in the cache. I've traced into the cache call and could see that my getObject() returned the right object. The object is loaded and no error is reported by MCJIT::generateCodeForModule(Module *M):

    // Load the object into the dynamic linker.
    // MCJIT now owns the ObjectImage pointer (via its LoadedObjects list).
    ErrorOr<std::unique_ptr<object::ObjectFile>> LoadedObject =
        object::ObjectFile::createObjectFile(ObjectToLoad->getMemBufferRef());
    std::unique_ptr<RuntimeDyld::LoadedObjectInfo> L =
        Dyld.loadObject(*LoadedObject.get());

    if (Dyld.hasError())
      report_fatal_error(Dyld.getErrorString());

After the generateCodeForModule call, findExistingSymbol is invoked. For some reason it cannot find my symbol.
My symbol exists in the Module that was used as input for generateCodeForModule. It uses the object cache, and that code was valid and generated in the previous run.

I tried to trace further into RuntimeDyld::SymbolInfo getSymbol() and could see that the GlobalSymbolTable did not contain my symbol.

I obviously forgot something or did something wrong. What puzzles me is that the failure is semi-random: even the simplest statements, such as int x = 1; (no dependencies), sometimes fail.

I did not hardcode any pointers in the IR, so the objects should be reusable across multiple instances. Does anyone have a clue?

Cheers,
Lang Hames via llvm-dev
2016-Jul-07 21:15 UTC
[llvm-dev] ObjectCache and getFunctionAddress issue
Hi Koffie,

Your cache class looks good to me. Can you provide a test case?

Have you added an IR Module for each cached object before finalizing anything? The way MCJIT is written at the moment, a Module needs to be present with a definition of each function before the cache is searched (not ideal - we should fix this, but it's how things are at the moment).

So one way that this could fail is if you start with two modules A and B, where A depends on B:

A.ll:

    declare void @bar
    define void @foo ... {
      call void @bar
      ret void
    }

B.ll:

    define void @bar ... {
      ret void
    }

If you launch with a cache already containing code for these modules, but only add A.ll (and not B.ll) to the JIT on the second run, the following could happen:

(1) When searching for @foo, MCJIT::findSymbol (where all the real work for getFunctionAddress is done) won't find anything via findExistingSymbol, so it'll call findModuleForSymbol '@foo', which will return A.ll.

(2) MCJIT::findSymbol will do a cache lookup for A.ll and find your stored A.o.

(3) When you finalize A.o, RuntimeDyld will need the address for @bar, so it'll call (via a few levels of indirection) back to MCJIT::findSymbol again. MCJIT::findSymbol won't find @bar via findExistingSymbol, so it'll call findModuleForSymbol '@bar', but there is no module containing '@bar' (even though there's an object containing it in the cache), so you'll get a nullptr back representing "symbol not found".

This wouldn't explain the intermittent nature of the failure on its own, but if you're testing by hand in a REPL then success vs. failure will probably depend on whether you've evaluated '@bar' before '@foo' on any given test run.

Hope this helps. If not, let me know and we'll dig in further.

- Lang.
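The lookup order in steps (1) to (3) can be sketched as a small stand-alone model. This is only an illustration under stated assumptions, not MCJIT code: ToyModule, ToyJit, and the fake addresses are hypothetical names invented here, and the real MCJIT does far more work at each step. The one behaviour it aims to capture is that the cache is keyed by module, so it is only consulted once an added module is known to define the requested symbol.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR module (and for the object built from it).
struct ToyModule {
    std::string name;               // plays the role of the module identifier
    std::set<std::string> defines;  // symbols defined by this module
    std::set<std::string> needs;    // external symbols it references
};

// Simplified model of the lookup order only; this is not the MCJIT code.
class ToyJit {
public:
    explicit ToyJit(std::map<std::string, ToyModule> cache)
        : cache_(std::move(cache)) {}

    void addModule(const ToyModule &m) {
        assert(!m.name.empty() && "the cache is keyed by module name");
        added_.push_back(m);
    }

    // Returns a fake address, or 0 for "symbol not found".
    std::uint64_t findSymbol(const std::string &sym) {
        // (1) Symbols from objects that are already loaded.
        auto hit = loaded_.find(sym);
        if (hit != loaded_.end()) return hit->second;

        // (2) Otherwise look for an added module that defines the symbol.
        for (const ToyModule &m : added_) {
            if (m.defines.count(sym) == 0) continue;

            // (3) Only now is the cache consulted, keyed by that module.
            auto obj = cache_.find(m.name);
            if (obj == cache_.end()) return 0;  // a real JIT would compile here

            // Resolve the object's external references through the same path.
            // (No cycle handling: this toy assumes acyclic dependencies.)
            for (const std::string &dep : obj->second.needs)
                if (findSymbol(dep) == 0) return 0;

            // "Load" the object: give each symbol it defines an address.
            for (const std::string &def : obj->second.defines)
                loaded_[def] = next_addr_++;

            auto done = loaded_.find(sym);
            return done == loaded_.end() ? 0 : done->second;
        }
        // No added module defines the symbol, so the cache is never searched.
        return 0;
    }

private:
    std::map<std::string, ToyModule> cache_;       // module name -> cached object
    std::vector<ToyModule> added_;                 // modules added to the JIT
    std::map<std::string, std::uint64_t> loaded_;  // resolved symbols
    std::uint64_t next_addr_ = 0x1000;             // fake address counter
};
```

In this model, a JIT that has the cache for both A and B but only module A added returns 0 for both bar and foo, while adding B as well lets both lookups succeed.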
koffie drinker via llvm-dev
2016-Jul-07 22:05 UTC
[llvm-dev] ObjectCache and getFunctionAddress issue
Hi Lang,

Thanks for the clarification. I solved the dependency problem by constructing a function dependency graph and sorting it. This way, I always process items in the right order.

I did some more debugging and found the cause of my issue: I had an incorrect function name <-> cached object mapping. So when I tried to resolve the function name, the lookup could not find it in the processed module.

All seems fine now, though I need to do more testing.
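The post does not say how the dependency graph was sorted, so the following is only one possible sketch: a depth-first topological sort that places each function after everything it calls. DepGraph, Mark, visit, and sortDependenciesFirst are hypothetical names chosen for this example. It reports failure on a cycle (for instance mutually recursive functions), a case that would need separate handling such as keeping those functions in the same module.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Function name -> names of the functions it calls (assumed input shape).
using DepGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first visit: append fn to order only after all of its callees.
// Returns false if a cycle is reached.
static bool visit(const std::string &fn, const DepGraph &deps,
                  std::map<std::string, Mark> &marks,
                  std::vector<std::string> &order) {
    Mark &m = marks[fn];  // new entries are value-initialized to Unvisited
    if (m == Mark::Done) return true;
    if (m == Mark::InProgress) return false;  // back edge: cycle
    m = Mark::InProgress;

    auto it = deps.find(fn);
    if (it != deps.end())
        for (const std::string &callee : it->second)
            if (!visit(callee, deps, marks, order)) return false;

    m = Mark::Done;  // std::map references stay valid across insertions
    order.push_back(fn);
    return true;
}

// Fills order so that every function appears after the functions it calls.
// Returns false (and leaves order empty) if the graph contains a cycle.
bool sortDependenciesFirst(const DepGraph &deps,
                           std::vector<std::string> &order) {
    order.clear();
    std::map<std::string, Mark> marks;
    for (const auto &entry : deps) {
        if (!visit(entry.first, deps, marks, order)) {
            order.clear();
            return false;
        }
    }
    // On success, every visited function was appended exactly once.
    assert(order.size() == marks.size());
    return true;
}
```

Processing functions in the resulting order means that a callee such as bar has been handled before its caller foo is finalized.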