I am writing a compiler using LLVM 3.2 to generate native code (currently x86-64) from IR. The native code will be linked by the system linker (not a JIT).

The compiler generates calls to a run-time library to perform many operations. Therefore, each Module that I generate needs to have declarations for all of these run-time functions added to it.

Question: is this true? I am assuming that LLVM works like a C++ compiler: before you can call a function from anywhere in a compilation unit, you need its prototype in scope.

Initially, I did this by calling Function::Create for each declaration I wanted to make. However, this is starting to "not scale".

I also want to experiment with defining some of these library functions using LLVM IR directly. I can then have LLVM inline and optimize calls to these functions. Given that many of the arguments to the functions are constants, there is plenty of opportunity for loop unrolling and optimization.

To this end, I would like to read LLVM bitcode into an existing module. The bitcode would contain declarations for all of my library functions, plus definitions for anything I want to try to inline and optimize. ReaderWriter provides an API for loading bitcode and returning a Module as a result.

One possibility is for me to read the bitcode into a skeleton module and then have the compiler emit more code into that module. I won't have control over the name of the module if I do this - I'm not sure whether that will cause a problem down the road.

There also seems to be a mechanism for adding "library dependencies" to a Module. This suggested that perhaps I could read my bitcode into a master library module held off to the side, and have the compiler reference that master module as a library dependency in everything it generates. However, I couldn't easily see how the library mechanism works.

What is the most reasonable way for me to declare large numbers of functions into a module?
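For concreteness, a minimal sketch (not part of the original message) of the Function::Create pattern being described, for a hypothetical runtime routine i8* rt_alloc(i64), using the pre-3.3 header layout:

    // Hand-rolled declaration of a made-up runtime function: i8* rt_alloc(i64).
    // One block like this per runtime function is the part that stops scaling.
    #include "llvm/LLVMContext.h"   // pre-3.3 paths; later releases use llvm/IR/
    #include "llvm/Module.h"
    #include "llvm/DerivedTypes.h"
    #include "llvm/Function.h"

    llvm::Function *declareRtAlloc(llvm::Module &M) {
      llvm::LLVMContext &C = M.getContext();
      llvm::Type *Args[] = { llvm::Type::getInt64Ty(C) };
      llvm::FunctionType *FT = llvm::FunctionType::get(
          llvm::Type::getInt8PtrTy(C), Args, /*isVarArg=*/false);
      return llvm::Function::Create(FT, llvm::Function::ExternalLinkage,
                                    "rt_alloc", &M);
    }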
Hi David,

On 21 July 2014 02:22, David Jones <djones at xtreme-eda.com> wrote:
> Question: is this true? I am assuming that LLVM works like a C++ compiler:
> before you can call a function from anywhere in a compilation unit, you
> need its prototype in scope.

Pretty much. There is only one scope for functions in LLVM IR (module-global), but they do have to be declared.

> Initially, I did this by calling Function::Create for each declaration I
> wanted to make. However, this is starting to "not scale".

You don't say quite why it's not scaling, but Module::getOrInsertFunction makes some of the details easier. It's how I'd always create a function declaration.

> To this end, I would like to read LLVM bitcode into an existing module.
> The bitcode would contain declarations for all of my library functions,
> plus definitions for anything I want to try to inline and optimize.

The easiest way to do this is probably to use the llvm::Linker class. That utility class basically just merges one module's definitions into another, so you load your library bitcode and link it into the module you actually care about.

Alternatively, you could just call setModuleIdentifier to rename the loaded library module. I suppose that would be simpler, unless you can see yourself splitting the library functions into multiple files to help organisation.

Once your library is in the same module as your functions, you can probably simplify the management issue too: declarations will already exist, so you can just look them up by name with Module::getOrInsertFunction.

You could actually import *just* the declarations like that if you wanted to experiment later with how much benefit comes from the inlining. That is, load a module which looks like just:

    declare float @sinf(float)
    declare double @sin(double)

> There also seems to be a mechanism for adding "library dependencies" to a
> Module.

Hmm. Not heard of that one. It's the kind of thing multiple languages would find useful, so it wouldn't surprise me if it did exist, but I've not encountered it anywhere.

Cheers.

Tim.
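For reference, a rough, uncompiled sketch of the two suggestions above (llvm::Linker to merge the library in, then Module::getOrInsertFunction for lookups), assuming the 3.2-era headers and API; "runtime.bc" and "rt_alloc" are invented example names:

    #include "llvm/Module.h"
    #include "llvm/DerivedTypes.h"
    #include "llvm/Linker.h"
    #include "llvm/Support/IRReader.h"
    #include "llvm/Support/SourceMgr.h"

    bool importRuntime(llvm::Module &Dest, llvm::LLVMContext &Ctx) {
      // Load the library bitcode/IR and merge its contents into Dest.
      llvm::SMDiagnostic Diag;
      llvm::Module *Lib = llvm::ParseIRFile("runtime.bc", Diag, Ctx);
      if (!Lib)
        return false;
      std::string ErrMsg;
      if (llvm::Linker::LinkModules(&Dest, Lib, llvm::Linker::DestroySource,
                                    &ErrMsg))
        return false;  // LinkModules returns true on error

      // The declarations now live in Dest, so fetch (or create, if missing)
      // one by name.  The result may be a bitcast of the Function if an
      // existing declaration has a different type.
      llvm::Type *Args[] = { llvm::Type::getInt64Ty(Ctx) };
      llvm::FunctionType *FT = llvm::FunctionType::get(
          llvm::Type::getInt8PtrTy(Ctx), Args, /*isVarArg=*/false);
      llvm::Constant *RtAlloc = Dest.getOrInsertFunction("rt_alloc", FT);
      return RtAlloc != 0;
    }

DestroySource lets the linker cannibalise the loaded library module, while PreserveSource keeps it intact in case the same library module is linked into several compilation units.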
On Mon, Jul 21, 2014 at 9:22 AM, David Jones <djones at xtreme-eda.com> wrote:
> I am writing a compiler using LLVM 3.2 to generate native code (currently
> x86-64) from IR. The native code will be linked by the system linker (not
> a JIT).
>
> The compiler generates calls to a run-time library to perform many
> operations. Therefore, each Module that I generate needs to have
> declarations for all of these run-time functions added to it.
>
> Question: is this true? I am assuming that LLVM works like a C++ compiler:
> before you can call a function from anywhere in a compilation unit, you
> need its prototype in scope.

Yes.

> Initially, I did this by calling Function::Create for each declaration I
> wanted to make. However, this is starting to "not scale".

I also started out on this path, but once there are a lot of runtime functions it takes a lot of code to generate them, which is what I presume you mean by "not scale".

> I also want to experiment with defining some of these library functions
> using LLVM IR directly. I can then have LLVM inline and optimize calls to
> these functions. Given that many of the arguments to the functions are
> constants, there is plenty of opportunity for loop unrolling and
> optimization.
>
> To this end, I would like to read LLVM bitcode into an existing module.
> The bitcode would contain declarations for all of my library functions,
> plus definitions for anything I want to try to inline and optimize.

Being unable to find any articles that suggested one method or another, I do the following, which works well for what I am doing (a rough sketch follows at the end of this message):

1. I have a runtime library written in C++ and compiled with clang.
2. I have a project module with a single .cpp file that includes all the relevant headers to pick up the code inlined in the .h files, and that uses enough runtime methods for the inline code to be generated. In practice this is not that much code.
3. I compile that .cpp file to a .bc file, runtimeinterface.bc.
4. When it comes to code generation time in my compiler, I first load and parse runtimeinterface.bc:

       this->codeModule = llvm::ParseIRFile(libFile, ed, this->Context);

5. I then use setModuleIdentifier() to change the module ID appropriately.
6. I then locate the 'junk' added by clang, such as the GLOBAL__I constructor stuff and my dummy runtime interface function, and call removeFromParent() on it.

I now have declarations for all the runtime methods, which I need not hard-code into the compiler, as well as the IR for the internal, link-once, inlinable functions from my runtime .h files.

Use module->getFunction() to find the function definitions, plus the appropriate other getXXX calls in the Module class. Use assert so that you can detect in debug builds when someone has changed the runtime underneath you (you can no longer find the function or definition).

I then start code generation into this module as if it were an empty one, as in the usual case.

I have to admit that I just dreamed this method up after casting around with Google quite a bit, and I can see a few possible issues with it, such as being dependent on the version of clang that is used. This works for me because I am in control of the end-use environment, however, and can ensure that the versions of the components being used are the correct ones.

If anyone with greater experience in LLVM than I have sees other issues, then please pipe up!

Cheers,
Jim
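For reference, a rough, uncompiled sketch of the flow described above, using the pre-3.3 headers and invented names ("_GLOBAL__I_a" stands in for the clang-generated constructor glue, "rt_vec_add" for a runtime entry point, "my_unit" for the module identifier):

    #include <cassert>
    #include "llvm/Module.h"
    #include "llvm/Support/IRReader.h"
    #include "llvm/Support/SourceMgr.h"

    llvm::Module *loadRuntimeModule(llvm::LLVMContext &Ctx) {
      llvm::SMDiagnostic Diag;
      llvm::Module *M = llvm::ParseIRFile("runtimeinterface.bc", Diag, Ctx);
      if (!M)
        return 0;

      // Rename the module so downstream code sees the identifier it expects.
      M->setModuleIdentifier("my_unit");

      // Strip the clang-generated 'junk' (constructor glue, dummy interface
      // function).  eraseFromParent() deletes the function outright.
      if (llvm::Function *Ctor = M->getFunction("_GLOBAL__I_a"))
        Ctor->eraseFromParent();

      // The runtime entry points are now ordinary declarations/definitions;
      // assert so a renamed runtime is caught in debug builds.
      llvm::Function *VecAdd = M->getFunction("rt_vec_add");
      assert(VecAdd && "runtime library changed underneath the compiler");
      (void)VecAdd;

      return M;  // code generation continues into this module
    }

One small difference from the message above: the sketch uses eraseFromParent(), which deletes the unwanted function, whereas removeFromParent() only unlinks it from the module and leaves the caller responsible for freeing it.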