thr3ads.net - llvm dev - [LLVMdev] LLVM bytecode simulator/emulator? [Jul 2006]

Kenneth Hoste

2006-Jul-14 05:26 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

Chris Lattner wrote:
> On Thu, 13 Jul 2006, Kenneth Hoste wrote:
>> After browsing through the docs, at a first glance I think I should
>> write a plugin for the 'analyze' tool. I think
>> http://llvm.org/docs/WritingAnLLVMPass.html is where I should start
>> from.
>> The only problem I see now is that there doesn't seem to be a way to
>> get information on a single instruction while being able to keep state
>> over all instructions... Is that possible, and if it is, can you tell
>> me how (or where I can find an example of it?).
>
> I don't really understand what you mean by this, but a ModulePass is
> fully general, it can do anything.  Can you explain what you're trying
> to do?

The way we currently characterize programs is by adding our own code
after every instruction. Each time that instruction is executed (which
can happen several times, for example inside a loop), we update our
state. A simple example is counting the number of dynamic instructions
executed, or measuring the instruction mix (% loads, % stores, ...) of
the dynamic execution.
If I were able to do that using LLVM, we would have characteristics at a
higher level of abstraction. The documentation on the BasicBlockPass
says not to keep state across different basic blocks, but that is
exactly what I want to do. Also, I need a way to iterate over the
_dynamic_ instruction stream. Is there a way to do that?

Example static vs dynamic:

static:
L: add x, y
    sub y, z
    jmpif z>100
    mul x, z

dynamic:
add x, y
sub y, z
jmpif z>100
add x, y
sub y, z
jmpif z>100
...
jmpif z>100
mul x, z
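
To make the static/dynamic distinction concrete, here is a minimal Python sketch that expands the static listing above into its dynamic stream. It is only an illustration: the function name `dynamic_trace` and the `times_taken` parameter are hypothetical, and the sketch assumes the `jmpif` branches back to label L a fixed number of times instead of evaluating `z>100`.

```python
from collections import Counter

# Static listing from the example above; index 0 carries the label L.
STATIC = ["add x, y", "sub y, z", "jmpif z>100", "mul x, z"]

def dynamic_trace(static, times_taken):
    """Yield instructions in execution order.

    Assumption: 'jmpif' branches back to index 0 and is taken
    `times_taken` times before it falls through.
    """
    pc = 0
    remaining = times_taken
    while pc < len(static):
        instr = static[pc]
        yield instr
        if instr.startswith("jmpif") and remaining > 0:
            remaining -= 1
            pc = 0       # branch taken: back to label L
        else:
            pc += 1      # fall through to the next instruction

trace = list(dynamic_trace(STATIC, times_taken=2))
opcode_counts = Counter(instr.split()[0] for instr in trace)
print(len(STATIC), "static instructions,", len(trace), "dynamic instructions")
print(dict(opcode_counts))
```

The static program always has four instructions, while the length of the dynamic stream depends on how often the branch is taken at run time.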


If my problem still isn't clear, it's because I didn't explain it well.
Feel free to ask further questions. I'll look into the
documentation/examples today, to see if I can find some kind of dynamic
analysis.

greetings,

Kenneth


-- 
Statistics are like a bikini. What they reveal is suggestive, but what 
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste

Robert L. Bocchino Jr.

2006-Jul-14 14:08 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

Hi,

I have done something like this.  I wrote a simple pass that  
instrumented LLVM instructions with external calls into a library of  
instrumentation functions (in my case, I was printing out a trace of  
dynamic load, store, and other data).  To analyze a program, I  
compiled the program to LLVM, ran the analyze pass on it, then linked  
the instrumented program with the library (also compiled to LLVM).   
To generate the trace, I just ran the instrumented program.  For  
something very simple, like counting dynamic instructions, you could  
dispense with the library and modify the LLVM directly to do the  
following:  declare a static global counter, initialize it to zero in  
main, and increment it after every LLVM instruction in the original  
program.
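
The counter idea can be sketched on a toy instruction list in Python. This is not LLVM code: the tuple-based instruction format and the names `instrument` and `run` are made up for illustration. One deliberate deviation from the description above is that the sketch inserts the increment before each instruction, not after it, so that a taken branch is still counted.

```python
def instrument(program):
    """Return a copy of `program` with a ("count",) op before every instruction."""
    out = []
    for op in program:
        out.append(("count",))
        if op[0] == "jlt":
            # Original index i now starts at 2*i (its counter op), so a jump
            # there also counts the target instruction.
            op = ("jlt", op[1], op[2], 2 * op[3])
        out.append(op)
    return out

def run(program):
    """Execute the toy program; return (variables, dynamic instruction count)."""
    env, counter, pc = {}, 0, 0
    while pc < len(program):
        op = program[pc]
        pc += 1
        if op[0] == "count":
            counter += 1                 # the inserted instrumentation
        elif op[0] == "set":
            env[op[1]] = op[2]
        elif op[0] == "add":
            env[op[1]] += op[2]
        elif op[0] == "jlt" and env[op[1]] < op[2]:
            pc = op[3]                   # jump if variable < limit
    return env, counter

# i = 0; do { i += 1 } while (i < 5)
PROGRAM = [("set", "i", 0), ("add", "i", 1), ("jlt", "i", 5, 1)]
env, executed = run(instrument(PROGRAM))
print(env["i"], executed)   # prints: 5 11
```

The program has three static instructions, but the counter reports eleven executed ones: one `set`, plus five iterations of `add` and `jlt`.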

Rob

On Jul 13, 2006, at 10:26 PM, Kenneth Hoste wrote:
> Chris Lattner wrote:
>> On Thu, 13 Jul 2006, Kenneth Hoste wrote:
>>> After browsing through the docs, at a first glance I think I
>>> should write a plugin for the 'analyze' tool. I think
>>> http://llvm.org/docs/WritingAnLLVMPass.html is where I should
>>> start from.
>>> The only problem I see now is that there doesn't seem to be a way
>>> to get information on a single instruction while being able to
>>> keep state over all instructions... Is that possible, and if it
>>> is, can you tell me how (or where I can find an example of it?).
>> I don't really understand what you mean by this, but a ModulePass  
>> is fully general, it can do anything.  Can you explain what you're
>> trying to do?
>
> The way we're characterizing programs now, is adding our own code  
> after every instruction. When that instruction gets executed (which  
> can happen several times, for example inside a loop), we update our  
> state. A simple example is counting the number of dynamic  
> instructions executed, or the instruction mix (% loads, %  
> stores, ...) in the dynamic execution.
> If I was able to do that using LLVM, we would have characteristics  
> on a higher level of abstraction. The documentation on the  
> BasicBlock pass mentions not to keep state over different basic  
> blocks, but I do want that. Also, I need a way to iterate over the  
> _dynamic_ instruction stream. Is there a way to do that?
>
> Example static vs dynamic:
>
> static:
> L: add x, y
>    sub y, z
>    jmpif z>100
>    mul x, z
>
> dynamic:
> add x, y
> sub y, z
> jmpif z>100
> add x, y
> sub y, z
> jmpif z>100
> ...
> jmpif z>100
> mul x, z
>
>
> If my problem still isn't clear, it's because I didn't explain it
> well. Feel free to ask further questions. I'll look into the
> documentation/examples today, to see if I can find some kind of
> dynamic analysis.
>
> greetings,
>
> Kenneth
>
>
> -- 
> Statistics are like a bikini. What they reveal is suggestive, but  
> what they conceal is vital (Aaron Levenstein)
>
> Kenneth Hoste
> ELIS - Ghent University
> kenneth.hoste at elis.ugent.be
> http://www.elis.ugent.be/~kehoste
Robert L. Bocchino Jr.
Ph.D. Student
University of Illinois, Urbana-Champaign


John Criswell

2006-Jul-14 14:39 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

Kenneth Hoste wrote:
> Chris Lattner wrote:
>> On Thu, 13 Jul 2006, Kenneth Hoste wrote:
>>> After browsing through the docs, at a first glance I think I should
>>> write a plugin for the 'analyze' tool. I think
>>> http://llvm.org/docs/WritingAnLLVMPass.html is where I should start
>>> from.
>>> The only problem I see now is that there doesn't seem to be a way to
>>> get information on a single instruction while being able to keep
>>> state over all instructions... Is that possible, and if it is, can
>>> you tell me how (or where I can find an example of it?).
>>
>> I don't really understand what you mean by this, but a ModulePass is
>> fully general, it can do anything.  Can you explain what you're
>> trying to do?
>
> The way we're characterizing programs now, is adding our own code
> after every instruction. When that instruction gets executed (which
> can happen several times, for example inside a loop), we update our
> state. A simple example is counting the number of dynamic instructions
> executed, or the instruction mix (% loads, % stores, ...) in the
> dynamic execution.

Okay.  As Rob has already said, it sounds like you want to write an LLVM
pass that adds global variables and instructions to a program.  So, to
state it explicitly, you want to:

1) Compile the program that you want to instrument to LLVM bytecode 
using llvm-gcc.
2) Use an LLVM pass that you write to instrument the program.
3) Use LLVM's llc program to generate C or assembly code of your 
instrumented program.
4) Compile the C/asm code to native code with gcc and link it with any 
native code libraries that it needs.
5) Run the program and gather the information from your instrumentation 
instructions.
>
> If I was able to do that using LLVM, we would have characteristics on
> a higher level of abstraction. The documentation on the BasicBlock
> pass mentions not to keep state over different basic blocks, but I do
> want that. Also, I need a way to iterate over the _dynamic_
> instruction stream. Is there a way to do that?

I think you want to write a ModulePass instead of a BasicBlockPass.

A BasicBlockPass's runOnBasicBlock() method is called by the PassManager 
for each basic block in the program.  Therefore, a BasicBlockPass cannot 
calculate some piece of information while modifying one basic block and 
use that information when modifying another basic block (i.e. it cannot 
maintain state between invocations).

Since your instrumentation pass will need to add a global variable to 
the program, a BasicBlockPass is not suitable for what you want to do.  
I would recommend using a ModulePass.  A ModulePass is given an entire 
Module (i.e. an entire LLVM bytecode file).  Your pass can add a global 
counter variable and then iterate over every instruction in the Module, 
instrumenting it as needed.
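
As a rough picture of why a module-level view helps, here is a small Python model. It is not the LLVM pass API: the nested dictionary standing in for a Module, the function `run_on_module`, and the `incr` pseudo-instruction are all assumptions made for the sketch. The point is that a single invocation sees every function and block, so it can add the global once and keep state across blocks. The increment is placed before each instruction so that it also precedes block terminators.

```python
from collections import Counter

# Hypothetical toy module: function name -> list of basic blocks -> opcodes.
MODULE = {
    "globals": [],
    "functions": {
        "main": [["load", "add", "br"], ["call", "ret"]],
        "helper": [["load", "store", "ret"]],
    },
}

def run_on_module(module, counter_name="dyn_count"):
    """Toy stand-in for a module-level pass: sees the whole module at once."""
    module["globals"].append(counter_name)   # add the global counter once
    static_mix = Counter()                   # state shared across all blocks
    for blocks in module["functions"].values():
        for block in blocks:
            original = list(block)
            block.clear()
            for opcode in original:
                static_mix[opcode] += 1
                # Place the increment before the instruction so it also
                # precedes terminators such as 'br' and 'ret'.
                block.append("incr " + counter_name)
                block.append(opcode)
    return static_mix

mix = run_on_module(MODULE)
print(MODULE["globals"], sum(mix.values()))
print(MODULE["functions"]["helper"][0])
```

A per-block callback could still insert the increments, but it would have no natural place to create the global or to accumulate `static_mix` across blocks.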

-- John T.
>
> Example static vs dynamic:
>
> static:
> L: add x, y
>    sub y, z
>    jmpif z>100
>    mul x, z
>
> dynamic:
> add x, y
> sub y, z
> jmpif z>100
> add x, y
> sub y, z
> jmpif z>100
> ...
> jmpif z>100
> mul x, z
>
>
> If my problem still isn't clear, it's because I didn't explain it
> well. Feel free to ask further questions. I'll look into the
> documentation/examples today, to see if I can find some kind of
> dynamic analysis.
>
> greetings,
>
> Kenneth
>
>

Kenneth Hoste

2006-Jul-14 16:11 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

John Criswell wrote:
> Okay.  As Rob has already said, it sounds like you want to write an LLVM
> pass that adds global variables and instructions to a program.  So, to
> state it explicitly, you want to:
> 
> 1) Compile the program that you want to instrument to LLVM bytecode 
> using llvm-gcc.
> 2) Use an LLVM pass that you write to instrument the program.
> 3) Use LLVM's llc program to generate C or assembly code of your 
> instrumented program.
> 4) Compile the C/asm code to native code with gcc and link it with any 
> native code libraries that it needs.
> 5) Run the program and gather the information from your instrumentation 
> instructions.
> 
That does sound like what I want to do, but I want the library
functions (for example printf) to be counted as well, i.e. I don't make
any distinction between program code and library code. This is essential
for me, because I need the analysis of the program to predict/estimate
performance, and that's quite hard, if not impossible, without taking
the instructions executed in library code into account.

Is there a way to statically compile my program (along with the library
code) into a single LLVM bytecode file? That way, I can just instrument
that file and go on from there with steps 3-5, skipping the 'link it
with any native code libraries' part. The libraries I need are either
pretty standard (e.g. glibc), or I have the code for them (so I can
compile it along with the program).

Maybe it's possible using the lli interpreter (which is a lot slower, I
know), instead of the analyze tool?

> Since your instrumentation pass will need to add a global variable to
> the program, a BasicBlockPass is not suitable for what you want to do.
> I would recommend using a ModulePass.  A ModulePass is given an entire
> Module (i.e. an entire LLVM bytecode file).  Your pass can add a global
> counter variable and then iterate over every instruction in the Module,
> instrumenting it as needed.

Okay, I get that... In ATOM, you're able to iterate over all basic
blocks, and inside each basic block iterate over all its instructions,
adding instrumentation code after each instruction that way. Hence
the confusion probably :)

Thanks!

Kenneth

-- 
Statistics are like a bikini. What they reveal is suggestive, but what 
they conceal is vital (Aaron Levenstein)

Kenneth Hoste
ELIS - Ghent University
kenneth.hoste at elis.ugent.be
http://www.elis.ugent.be/~kehoste

Chris Lattner

2006-Jul-14 17:44 UTC

[LLVMdev] LLVM bytecode simulator/emulator?

On Fri, 14 Jul 2006, Kenneth Hoste wrote:
>> I don't really understand what you mean by this, but a ModulePass is fully
>> general, it can do anything.  Can you explain what you're trying to do?
>
> The way we're characterizing programs now, is adding our own code after every
> instruction. When that instruction gets executed (which can happen several
> times, for example inside a loop), we update our state. A simple example is

Right.

> counting the number of dynamic instructions executed, or the instruction mix
> (% loads, % stores, ...) in the dynamic execution.
> If I was able to do that using LLVM, we would have characteristics on a
> higher level of abstraction. The documentation on the BasicBlock pass
> mentions not to keep state over different basic blocks, but I do want that.
> Also, I need a way to iterate over the _dynamic_ instruction stream. Is there
> a way to do that?

You can't iterate over dynamic instructions without running the program. 
Optimization passes happen at compile time, not runtime.  If you want 
information about the dynamic behavior of the program, either modify the 
interpreter, or insert code that computes the properties you care about.
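
The "modify the interpreter" option can be pictured with a toy interpreter in Python. Nothing here is lli's actual structure: the instruction set, the `interpret` function and its `on_execute` hook are hypothetical. The sketch only shows that an interpreter visits instructions in dynamic order, so a hook in its main loop is enough to compute an instruction mix.

```python
from collections import Counter

def interpret(program, memory, on_execute):
    """Toy interpreter; calls on_execute(opcode) for every executed instruction."""
    regs, pc = {}, 0
    while pc < len(program):
        op, *args = program[pc]
        on_execute(op)              # the profiling hook sees the dynamic stream
        pc += 1
        if op == "li":              # li reg, constant
            regs[args[0]] = args[1]
        elif op == "load":          # load dst, address_reg
            regs[args[0]] = memory[regs[args[1]]]
        elif op == "store":         # store src, address_reg
            memory[regs[args[1]]] = regs[args[0]]
        elif op == "addi":          # addi reg, constant
            regs[args[0]] += args[1]
        elif op == "blt" and regs[args[0]] < args[1]:   # blt reg, limit, target
            pc = args[2]
    return memory

# Increment each of four memory cells by 1.
PROGRAM = [
    ("li", "i", 0),
    ("load", "v", "i"),
    ("addi", "v", 1),
    ("store", "v", "i"),
    ("addi", "i", 1),
    ("blt", "i", 4, 1),
]
mix = Counter()
memory = interpret(PROGRAM, [10, 20, 30, 40], lambda op: mix.update([op]))
total = sum(mix.values())
for op, n in sorted(mix.items()):
    print(f"{op:6s} {100.0 * n / total:5.1f}%")
```

Compared with inserting code into the program, this leaves the program untouched, at the cost of running everything through the slower interpreter.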

-Chris
> Example static vs dynamic:
>
> static:
> L: add x, y
>   sub y, z
>   jmpif z>100
>   mul x, z
>
> dynamic:
> add x, y
> sub y, z
> jmpif z>100
> add x, y
> sub y, z
> jmpif z>100
> ...
> jmpif z>100
> mul x, z
>
>
> If my problem still isn't clear, it's because I didn't explain it well. Feel
> free to ask further questions. I'll look into the documentation/examples
> today, to see if I can find some kind of dynamic analysis.
>
> greetings,
>
> Kenneth
>
>
>
-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
