thr3ads.net - llvm dev - [LLVMdev] LLVM bytecode simulator/emulator? [Jul 2006]

Kenneth Hoste

2006-Jul-14 16:11 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

John Criswell wrote:
> Okay.  As Rob has already said, it sounds like you want to write an LLVM 
> pass that adds global variables and instructions to a program.  So, to 
> state it explicitly, you want to:
> 
> 1) Compile the program that you want to instrument to LLVM bytecode 
> using llvm-gcc.
> 2) Use an LLVM pass that you write to instrument the program.
> 3) Use LLVM's llc program to generate C or assembly code of your 
> instrumented program.
> 4) Compile the C/asm code to native code with gcc and link it with any 
> native code libraries that it needs.
> 5) Run the program and gather the information from your instrumentation 
> instructions.
> 
That sounds indeed like what I want to do, but... I want the library 
functions (for example printf) to be counted as well. I.e., I don't make 
any distinction between program code and library code. This is essential 
for me, because I need the analysis of the program to predict/estimate 
performance, and that's quite hard, if not impossible, without taking 
the library count into account.

Is there a way to statically compile my program (along with the library 
code) into an LLVM bytecode file? That way, I can just instrument that, 
and go on from there with steps 3-5, ignoring the 'link with any native 
code library'. The libraries I need are either pretty standard (i.e. 
glibc), or I have the code for them (so I can compile it along with the 
program).

Maybe it's possible using the lli interpreter (which is a lot slower, I 
know), instead of the analyze tool?

> Since your instrumentation pass will need to add a global variable to 
> the program, a BasicBlockPass is not suitable for what you want to do.  
> I would recommend using a ModulePass.  A ModulePass is given an entire 
> Module (i.e. an entire LLVM bytecode file).  Your pass can add a global 
> counter variable and then iterate over every instruction in the Module, 
> instrumenting it as needed.

Okay, I get that... In ATOM, you're able to iterate over all basic 
blocks, and inside each basic block iterate over all its instructions, 
and add instrumentation code after each instruction that way. Hence 
the confusion probably :)
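
The nesting discussed here (a module, then its functions, then basic blocks, then instructions, plus one global counter) can be sketched without any LLVM dependency. The C++ below is a toy model, not the LLVM pass API: `Module`, `Function`, `BasicBlock`, `Instruction`, and `instrumentModule` are hypothetical stand-ins invented for illustration, and the counter name `__inst_count` is an assumption. The increment is placed before each instruction rather than after it, so that a block's final branch or return stays last.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins that only mirror the module/function/block/instruction
// nesting. These are NOT LLVM classes.
struct Instruction { std::string text; bool isCounterBump; };
struct BasicBlock  { std::vector<Instruction> insts; };
struct Function    { std::string name; std::vector<BasicBlock> blocks; };
struct Module {
    std::vector<std::string> globals;   // names of global variables
    std::vector<Function> functions;    // functions that have a body
};

// Module-level "pass": add one global counter, then put a counter
// increment in front of every original instruction.
// Returns the number of increments inserted.
std::size_t instrumentModule(Module &m,
                             const std::string &counter = "__inst_count") {
    // Adding a global needs whole-module scope, hence a module-level pass.
    m.globals.push_back(counter);
    std::size_t inserted = 0;
    for (Function &f : m.functions) {
        for (BasicBlock &bb : f.blocks) {
            std::vector<Instruction> out;
            out.reserve(bb.insts.size() * 2);
            for (const Instruction &i : bb.insts) {
                assert(!i.isCounterBump && "module is already instrumented");
                out.push_back(Instruction{counter + " += 1", true});
                out.push_back(i);       // original instruction follows its bump
                ++inserted;
            }
            bb.insts = std::move(out);
        }
    }
    return inserted;
}
```

Calling `instrumentModule` on a module whose single block holds two instructions adds one global and returns 2. A real pass would do the same walk over LLVM's own classes.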

Thanks!

Kenneth

-- 
Statistics are like a bikini. What they reveal is suggestive, but what 
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste

Chris Lattner

2006-Jul-14 17:43 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

On Fri, 14 Jul 2006, Kenneth Hoste wrote:
> That sounds indeed like what I want to do, but... I want the library
> functions (for example printf) to be counted as well. I.e., I don't make any
> distinction between program code and library code. This is essential for me,
> because I need the analysis of the program to predict/estimate performance,
> and that's quite hard, if not impossible, without taking the library count
> into account.

You can certainly do that.  The first step is to find source for these 
functions and compile them to LLVM bytecode.  Only once you do that can 
you link the bytecode together.

> Maybe it's possible using the lli interpreter (which is a lot slower, I
> know), instead of the analyze tool?

No, they both have the same scope: all the code compiled to LLVM bytecode. 
Neither can inspect native system code.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

John Criswell

2006-Jul-14 18:26 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

Kenneth Hoste wrote:
> John Criswell wrote:
>> Okay.  As Rob has already said, it sounds like you want to write an 
>> LLVM pass that adds global variables and instructions to a program.  
>> So, to state it explicitly, you want to:
>>
>> 1) Compile the program that you want to instrument to LLVM bytecode 
>> using llvm-gcc.
>> 2) Use an LLVM pass that you write to instrument the program.
>> 3) Use LLVM's llc program to generate C or assembly code of your 
>> instrumented program.
>> 4) Compile the C/asm code to native code with gcc and link it with 
>> any native code libraries that it needs.
>> 5) Run the program and gather the information from your 
>> instrumentation instructions.
>>
>
> That sounds indeed like what I want to do, but... I want the library 
> functions (for example printf) to be counted as well. I.e., I don't 
> make any distinction between program code and library code. This is 
> essential for me, because I need the analysis of the program to 
> predict/estimate performance, and that's quite hard, if not 
> impossible, without taking the library count into account.
>
> Is there a way to statically compile my program (along with the 
> library code) into an LLVM bytecode file? That way, I can just 
> instrument that, and go on from there with steps 3-5, ignoring the 
> 'link with any native code library'. The libraries I need are either 
> pretty standard (i.e. glibc), or I have the code for them (so I can 
> compile it along with the program).

You can compile library code into LLVM bytecode libraries and link them 
with gccld.  In general, LLVM provides tools equivalent to most of your 
compiler tool chain (gccas, gccld, llvm-ar, llvm-nm, etc).  You can, for 
example, make an archive of LLVM bytecode files.

The problem, in your case, is that no one has successfully compiled all 
of libc into LLVM bytecode yet.  Some libc functions are compiled to 
LLVM bytecode (see llvm/runtime), but many others are not; for those, we 
link against the native libraries after code generation.
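
The split described above, where some functions are resolved at bytecode link time and the rest are left for the native linker, can be illustrated with a small symbol-table model. This is only a sketch under assumptions: `BytecodeFile` and `unresolvedAfterLink` are hypothetical names, and the code does not reflect how gccld is implemented.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model of one bytecode file's symbol table (not an LLVM structure).
struct BytecodeFile {
    std::set<std::string> defined;     // functions with a body in this file
    std::set<std::string> referenced;  // functions this file calls
};

// Returns the symbols still undefined after merging all bytecode files.
// These would have to come from native libraries after code generation,
// so an instrumentation pass never sees their instructions.
std::set<std::string>
unresolvedAfterLink(const std::vector<BytecodeFile> &files) {
    std::set<std::string> defined;
    for (const BytecodeFile &f : files) {
        for (const std::string &name : f.defined) {
            bool fresh = defined.insert(name).second;
            assert(fresh && "duplicate definition across bytecode files");
            (void)fresh;  // silence unused-variable warnings under NDEBUG
        }
    }
    std::set<std::string> unresolved;
    for (const BytecodeFile &f : files)
        for (const std::string &name : f.referenced)
            if (defined.count(name) == 0)
                unresolved.insert(name);
    return unresolved;
}
```

Every name in the returned set marks code whose instructions would not be counted unless its source is also compiled to bytecode.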

You can try compiling the parts of libc you need to LLVM bytecode, but 
be forewarned that it'll be tedious.  C libraries (glibc in particular) 
seem to be designed to make life difficult for people who want to 
compile them.

-- John T.

>
> Maybe it's possible using the lli interpreter (which is a lot slower, 
> I know), instead of the analyze tool?
>
>> Since your instrumentation pass will need to add a global variable to 
>> the program, a BasicBlockPass is not suitable for what you want to 
>> do.  I would recommend using a ModulePass.  A ModulePass is given an 
>> entire Module (i.e. an entire LLVM bytecode file).  Your pass can add 
>> a global counter variable and then iterate over every instruction in 
>> the Module, instrumenting it as needed.
>
> Okay, I get that... In ATOM, you're able to iterate over all basic 
> blocks, and inside each basic block iterate over all its 
> instructions, and add instrumentation code after each instruction 
> that way. Hence the confusion probably :)
>
> Thanks!
>
> Kenneth
>

Kenneth Hoste

2006-Jul-14 19:36 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

John Criswell wrote:
> You can compile library code into LLVM bytecode libraries and link them 
> with gccld.  In general, LLVM provides tools equivalent to most of your 
> compiler tool chain (gccas, gccld, llvm-ar, llvm-nm, etc).  You can, for 
> example, make an archive of LLVM bytecode files.
> 
> The problem, in your case, is that no one has successfully compiled all 
> of libc into LLVM bytecode yet.  Some libc functions are compiled to 
> LLVM bytecode (see llvm/runtime), but many others are not; for those, we 
> link against the native libraries after code generation.
> 
> You can try compiling the parts of libc you need to LLVM bytecode, but 
> be forewarned that it'll be tedious.  C libraries (glibc in particular)
> seem to be designed to make life difficult for people who want to 
> compile them.

I think the number of glibc functions I need to support will be limited.

We have some compiler experts in our research group, who will be able to 
help me when I experience serious problems (I think).
I think it's worth a shot, at least until I can really evaluate if the 
effort is worth it or not.

Besides that, I can probably rely on the experience of some people on 
this mailing list? What is the best way to start such an ambitious 
project? Do I just check which external functions are called in the 
bytecode I get, and try to support those functions using LLVM code?
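
One way to picture the "check which external functions are called, then support those" loop is as a worklist: each library function that gets compiled to bytecode may itself call further functions, which then need the same treatment. The sketch below is an assumption-laden illustration, not an existing tool: `CallMap`, `PortPlan`, and `planPort` are hypothetical names, and the catalog of "which function calls which" would have to be built by hand or by inspecting the bytecode.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical catalog: for each library function whose source is
// available, the functions that its body calls.
typedef std::map<std::string, std::vector<std::string> > CallMap;

struct PortPlan {
    std::set<std::string> toCompile;    // source available: compile to bytecode
    std::set<std::string> staysNative;  // not in catalog: remains native code
};

// Starting from the external calls found in the program's bytecode,
// follow the catalog transitively and classify every function reached.
PortPlan planPort(const std::vector<std::string> &externalCalls,
                  const CallMap &catalog) {
    PortPlan plan;
    std::vector<std::string> worklist(externalCalls.begin(),
                                      externalCalls.end());
    while (!worklist.empty()) {
        std::string name = worklist.back();
        worklist.pop_back();
        assert(!name.empty() && "function names must be non-empty");
        if (plan.toCompile.count(name) || plan.staysNative.count(name))
            continue;                   // already classified (handles cycles)
        CallMap::const_iterator it = catalog.find(name);
        if (it == catalog.end()) {
            plan.staysNative.insert(name);
            continue;
        }
        plan.toCompile.insert(name);
        for (const std::string &callee : it->second)
            worklist.push_back(callee); // callees need the same treatment
    }
    return plan;
}
```

Anything that ends up in `staysNative` is code the instrumentation cannot count, which gives a rough measure of how much work remains.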

Why has no one supported printf yet? Is it that hard (I have no idea 
really, I'm only asking)?

Greetings, and thanks in advance for your replies.

Kenneth

-- 
Statistics are like a bikini. What they reveal is suggestive, but what 
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste
