John Criswell wrote:
> Okay. As Rob has already said, it sounds like you want to write an LLVM
> pass that adds global variables and instructions to a program. So, to
> state it explicitly, you want to:
>
> 1) Compile the program that you want to instrument to LLVM bytecode
>    using llvm-gcc.
> 2) Use an LLVM pass that you write to instrument the program.
> 3) Use LLVM's llc program to generate C or assembly code of your
>    instrumented program.
> 4) Compile the C/asm code to native code with gcc and link it with any
>    native code libraries that it needs.
> 5) Run the program and gather the information from your instrumentation
>    instructions.

That sounds indeed like what I want to do, but... I want the library
functions (for example printf) to be counted as well. I.e., I don't make
any distinction between program code and library code. This is essential
for me, because I need the analysis of the program to predict/estimate
performance, and that's quite hard, if not impossible, without taking the
library count into account.

Is there a way to statically compile my program (along with the library
code) into an LLVM bytecode file? That way, I can just instrument that,
and go on from there with steps 3-5, ignoring the 'link with any native
code library' part. The libraries I need are either pretty standard
(e.g. glibc), or I have the code for them (so I can compile it along with
the program).

Maybe it's possible using the lli interpreter (which is a lot slower, I
know), instead of the analyze tool?

> Since your instrumentation pass will need to add a global variable to
> the program, a BasicBlockPass is not suitable for what you want to do.
> I would recommend using a ModulePass. A ModulePass is given an entire
> Module (i.e. an entire LLVM bytecode file). Your pass can add a global
> counter variable and then iterate over every instruction in the Module,
> instrumenting it as needed.

Okay, I get that... In ATOM, you're able to iterate over all basic
blocks, and inside each basic block iterate over all its instructions,
adding instrumentation code after each instruction that way. Hence the
confusion probably :)

Thanks!

Kenneth

--
Statistics are like a bikini. What they reveal is suggestive, but what
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste
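The ModulePass approach quoted above (add one global counter, then walk every function, basic block, and instruction) can be sketched with a toy in-memory IR. This is a minimal sketch in Python, not LLVM's C++ API: the `Module`, `Function`, and `BasicBlock` classes, the `instrument` helper, and the counter name `__insn_count` are all hypothetical stand-ins. The increment is placed before each instruction rather than after it, an assumption made here so that the last instruction of a block stays last.

```python
from __future__ import annotations

from dataclasses import dataclass, field

COUNTER = "__insn_count"  # hypothetical name for the global counter


@dataclass
class BasicBlock:
    instructions: list[str] = field(default_factory=list)


@dataclass
class Function:
    name: str
    # An empty block list models a declaration without a body
    # (for example a library function that exists only as native code).
    blocks: list[BasicBlock] = field(default_factory=list)


@dataclass
class Module:
    functions: list[Function] = field(default_factory=list)
    global_vars: dict[str, int] = field(default_factory=dict)


def instrument(module: Module) -> int:
    """Add a global counter and one increment before every instruction.

    Returns the number of increments inserted.
    """
    module.global_vars[COUNTER] = 0
    inserted = 0
    for fn in module.functions:
        for bb in fn.blocks:
            rewritten = []
            for insn in bb.instructions:
                rewritten.append(f"inc {COUNTER}")  # instrumentation
                rewritten.append(insn)              # original instruction
                inserted += 1
            bb.instructions = rewritten
    return inserted


# Toy module: main has a body, printf is only declared.
module = Module(functions=[
    Function("main", [BasicBlock(["%x = add 1, 2", "call printf", "ret 0"])]),
    Function("printf"),
])
inserted = instrument(module)
print(inserted, module.functions[0].blocks[0].instructions)
```

In this toy model a function with no blocks is never visited by the inner loops, so nothing inside it is counted; only code that is present in the module gets instrumented.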
On Fri, 14 Jul 2006, Kenneth Hoste wrote:
> That sounds indeed like what I want to do, but... I want the library
> functions (for example printf) to be counted as well. I.e., I don't make
> any distinction between program code and library code. This is essential
> for me, because I need the analysis of the program to predict/estimate
> performance, and that's quite hard, if not impossible, without taking
> the library count into account.

You can certainly do that. The first step is to find source for these
functions and compile them to LLVM bytecode. Only once you do that can
you link the bytecode together.

> Maybe it's possible using the lli interpreter (which is a lot slower, I
> know), instead of the analyze tool?

No, they both have the same scope: all the code compiled to LLVM
bytecode. Neither can inspect native system code.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
Kenneth Hoste wrote:
> John Criswell wrote:
>> Okay. As Rob has already said, it sounds like you want to write an
>> LLVM pass that adds global variables and instructions to a program.
>> So, to state it explicitly, you want to:
>>
>> 1) Compile the program that you want to instrument to LLVM bytecode
>>    using llvm-gcc.
>> 2) Use an LLVM pass that you write to instrument the program.
>> 3) Use LLVM's llc program to generate C or assembly code of your
>>    instrumented program.
>> 4) Compile the C/asm code to native code with gcc and link it with
>>    any native code libraries that it needs.
>> 5) Run the program and gather the information from your
>>    instrumentation instructions.
>
> That sounds indeed like what I want to do, but... I want the library
> functions (for example printf) to be counted as well. I.e., I don't
> make any distinction between program code and library code. This is
> essential for me, because I need the analysis of the program to
> predict/estimate performance, and that's quite hard, if not
> impossible, without taking the library count into account.
>
> Is there a way to statically compile my program (along with the
> library code) into an LLVM bytecode file? That way, I can just
> instrument that, and go on from there with steps 3-5, ignoring the
> 'link with any native code library' part. The libraries I need are
> either pretty standard (e.g. glibc), or I have the code for them (so
> I can compile it along with the program).

You can compile library code into LLVM bytecode libraries and link them
with gccld. In general, LLVM provides tools equivalent to most of your
compiler tool chain (gccas, gccld, llvm-ar, llvm-nm, etc). You can, for
example, make an archive of LLVM bytecode files.

The problem, in your case, is that no one has successfully compiled all
of libc into LLVM bytecode yet. Some libc functions are compiled to
LLVM bytecode (see llvm/runtime), but many others are not; for those, we
link against the native libraries after code generation.

You can try compiling the parts of libc you need to LLVM bytecode, but
be forewarned that it'll be tedious. C libraries (glibc in particular)
seem to be designed to make life difficult for people who want to
compile them.

-- John T.

> Maybe it's possible using the lli interpreter (which is a lot slower,
> I know), instead of the analyze tool?
>
>> Since your instrumentation pass will need to add a global variable to
>> the program, a BasicBlockPass is not suitable for what you want to
>> do. I would recommend using a ModulePass. A ModulePass is given an
>> entire Module (i.e. an entire LLVM bytecode file). Your pass can add
>> a global counter variable and then iterate over every instruction in
>> the Module, instrumenting it as needed.
>
> Okay, I get that... In ATOM, you're able to iterate over all basic
> blocks, and inside each basic block iterate over all its
> instructions, adding instrumentation code after each instruction
> that way. Hence the confusion probably :)
>
> Thanks!
>
> Kenneth
John Criswell wrote:
> You can compile library code into LLVM bytecode libraries and link them
> with gccld. In general, LLVM provides tools equivalent to most of your
> compiler tool chain (gccas, gccld, llvm-ar, llvm-nm, etc). You can, for
> example, make an archive of LLVM bytecode files.
>
> The problem, in your case, is that no one has successfully compiled all
> of libc into LLVM bytecode yet. Some libc functions are compiled to
> LLVM bytecode (see llvm/runtime), but many others are not; for those, we
> link against the native libraries after code generation.
>
> You can try compiling the parts of libc you need to LLVM bytecode, but
> be forewarned that it'll be tedious. C libraries (glibc in particular)
> seem to be designed to make life difficult for people who want to
> compile them.

I think the number of glibc functions I need to support will be limited.
We have some compiler experts in our research group, who will be able to
help me when I run into serious problems (I think). I think it's worth a
shot, at least until I can really evaluate whether the effort is worth
it or not. Besides that, I can probably rely on the experience of some
people on this mailing list?

What is the best way to start such an ambitious project? Do I just check
which external functions are called in the bytecode I get, and try to
support those functions using LLVM code?

Why has no one supported printf yet? Is it that hard (I have no idea
really, I'm only asking)?

Greetings, and thanks already for your replies,

Kenneth

--
Statistics are like a bikini. What they reveal is suggestive, but what
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste
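One way to approach the question of which external functions would need bytecode is to walk the call graph from the entry point and collect every callee that has no body. The sketch below assumes a hypothetical call graph given as a plain Python dictionary; the function names, the callees listed for them, and the `unresolved_externals` helper are illustrative only, and nothing here uses an actual LLVM tool.

```python
from __future__ import annotations


def unresolved_externals(defined: dict[str, list[str]],
                         entry: str = "main") -> list[str]:
    """Return the sorted names of functions reachable from `entry`
    that are called but have no body in `defined`."""
    missing = set()
    seen = set()
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        if name in seen:
            continue
        seen.add(name)
        if name not in defined:
            # Called somewhere, but only declared: still native code.
            missing.add(name)
            continue
        worklist.extend(defined[name])
    return sorted(missing)


# Hypothetical program: keys are functions with a body in the module,
# values are the functions they call.
call_graph = {
    "main": ["compute", "printf"],
    "compute": ["sqrt", "helper"],
    "helper": [],
}
todo = unresolved_externals(call_graph)
print(todo)

# Supplying a body for printf may expose further dependencies
# (the callee named here is purely illustrative).
call_graph["printf"] = ["vfprintf"]
todo_after = unresolved_externals(call_graph)
print(todo_after)
```

The second call hints at why the work can grow: each library function that gets a body may itself call further functions that are still missing, so the list has to be recomputed until it is empty or until the remaining entries are deliberately left as native code.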