Dear List,

I have some questions about some passes for LLVM that I'm thinking about, but first I need to give a little background ...

One of the things I do in my "day job" (large IT system performance analysis) is to figure out where to place diagnostic instrumentation within an application. The instrumentation I'm talking about here is the kind that is inserted into application code to capture things like wall-clock latency, CPU time, number of I/O requests, etc. For accurate diagnosis, it is often necessary to collect that kind of data about every occurrence of some event or event pair (like entry/exit of a function). However, this kind of instrumentation can have very high overhead if used liberally. When an application is under heavy load, it is critical to select the instrumentation points correctly so as to minimize overhead and maximize the utility of the information provided.

Fortunately, there is a way to do this automatically. By constructing a static call graph of the application, it is possible to discover the "fan-out" points. Fan-out points are calls in the call chain that are called from very few places (typically one) but directly or indirectly call many functions. That is, some functions in an application are at the top of the call chain (e.g. "main", which implies all the processing of the program) while others are at the bottom (like "strlen", which implies no further calls and is a leaf node in the call chain). In between these two extremes (root and leaf), the call chain will fan in (like strlen) and fan out (like main). Architecturally, we can speak of the fan-out points as the places in the application code where a module boundary is being crossed. By instrumenting these fan-out points we can be assured that we're instrumenting things that (a) have architectural significance (i.e. good quality of information) and (b) imply enough processing that the overhead of instrumentation is negligible.

With the above kind of instrumentation in mind, I consider such auto-instrumentation as "just another pass" in LLVM. That is, I believe LLVM makes it pretty easy to do the call graph analysis, find the "fan-out" points, and insert the instrumentation code. So, given that it's relatively easy to do this, I have the following questions for the list:

(1) Would others find this kind of diagnostic instrumentation useful?

(2) Would it be useful to turn the call graph data into a pretty picture via graphviz (both statically and dynamically)?

(3) How much of this auto-instrumentation pass is already written in existing passes?

(4) Are there other ways to achieve the same ends?

(5) Can someone give me some tips on how to implement this? It would be my first LLVM pass.

On a side note, there are source language constructs for which I want to aggregate the data captured by the instrumentation. Call chains of functions are great, but sometimes you can't see the forest for the trees because the information is too dense. What is useful is to aggregate the performance data at higher levels of abstraction. For example, in a multi-threaded, object-oriented program you might want to see aggregation at the process, thread, class, object instance, and method levels. In order to instrument for these kinds of aggregated views, I need to be able to provide some kind of "context" (process id, thread id, class, etc.) in which a data point is captured. This implies that I need the instrumentation pass to understand some things about the source-level language and possibly capture information about the environment the instrumented application will run in. Unfortunately, that means that the pass would become either language specific or environment specific, because I would have to ensure that the source language compiler added the necessary constructs to the generated code to provide the contextual information needed by the pass. This raises a couple more questions:

(1) Is there a general mechanism to communicate information between the source language compiler and an LLVM pass? If there isn't, should there be? In the case I describe above it would be *highly* useful (IMO) to have the source language compiler provide source-level information for a language-independent pass to use later.

(2) Is there any existing mechanism in LLVM for providing (1) directly? What I'm thinking of is some kind of API that the source language compiler can use to add additional information that any subsequent pass might need to use.

Reid.
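A rough sketch of what the fan-out detection could look like as an LLVM pass is below. The thresholds, the pass name, and the "find-fanout" flag are placeholders, and the sketch uses the C++ CallGraph and legacy pass-manager API from a much newer LLVM tree than the one discussed in this thread, so headers and class names would need adjusting to the version actually in use:

    #include "llvm/Analysis/CallGraph.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Pass.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    namespace {
    // Report functions that are referenced from few places but make many
    // direct calls themselves, i.e. candidate "fan-out" instrumentation points.
    struct FindFanOutPoints : public ModulePass {
      static char ID;
      FindFanOutPoints() : ModulePass(ID) {}

      void getAnalysisUsage(AnalysisUsage &AU) const override {
        AU.addRequired<CallGraphWrapperPass>();
        AU.setPreservesAll();
      }

      bool runOnModule(Module &M) override {
        CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();
        for (Function &F : M) {
          if (F.isDeclaration())
            continue;
          CallGraphNode *N = CG[&F];
          unsigned Fanin  = N->getNumReferences(); // rough count of callers/uses
          unsigned Fanout = N->size();             // direct call sites inside F
          // Thresholds are arbitrary placeholders; tune for the application.
          if (Fanin <= 1 && Fanout >= 8)
            errs() << "fan-out candidate: " << F.getName() << "\n";
        }
        return false; // analysis only; the IR is not changed
      }
    };
    } // end anonymous namespace

    char FindFanOutPoints::ID = 0;
    static RegisterPass<FindFanOutPoints>
        X("find-fanout", "Report fan-out instrumentation points");

Counting getNumReferences() as fan-in and size() as fan-out is only a crude approximation of the call-chain criterion described above (it ignores indirect calls and transitive callees), but it is enough to mark candidate module-boundary functions for instrumentation.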
On Wed, 7 Apr 2004, Reid Spencer wrote:

> (2) Would it be useful to turn the call graph data into a pretty picture
> via graphviz (both statically and dynamically)?

Sure: analyze -print-callgraph foo.bc

> (3) How much of this auto-instrumentation pass is already written in
> existing passes?

There are various instrumentation passes already in LLVM, for example, to trace instructions as they execute, to insert profiling instrumentation, etc. These live in lib/Transforms/Instrumentation/

> (4) Are there other ways to achieve the same ends?

I dunno. It sounds like a good approach. :)

> (5) Can someone give me some tips on how to implement this? It would be
> my first LLVM pass.

I would take a look at other passes that do similar things. For example, look at clients of the CallGraph class and at a couple of the instrumentation classes. If you know what code you want to insert, it should be fairly straightforward.

> (1) Is there a general mechanism to communicate information between the
> source language compiler and an LLVM pass?

No, not at present.

> If there isn't, should there be? In the case I describe above it would
> be *highly* useful (IMO) to have the source language compiler provide
> source-level information for a language-independent pass to use later.

Sure; again, it's a matter of designing the feature so that it is broadly applicable and fits in with the LLVM "style". If the feature made sense in the context of LLVM, extending LLVM isn't a problem. How to get more source-level information into LLVM is an open question.

> (2) Is there any existing mechanism in LLVM for providing (1) directly?
> What I'm thinking of is some kind of API that the source language
> compiler can use to add additional information that any subsequent pass
> might need to use.

Not yet. If you're interested, you can check out the debugger documentation. In the case of the debugger, the source language front-end packages up language-specific information and passes it through LLVM. LLVM doesn't understand or interpret this information, though; the debugger process does in the end.

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
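To make the "if you know what code you want to insert" step concrete, here is a minimal sketch of planting a call to an external entry probe at the top of a function. The probe name __probe_enter is invented for illustration (the probe library would have to be linked into the instrumented program), and the IRBuilder/FunctionCallee interfaces shown are from a much later LLVM than the 2004 tree:

    #include "llvm/IR/DerivedTypes.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Module.h"

    using namespace llvm;

    // Insert a call to an external probe routine, e.g.
    //   void __probe_enter(const char *fn_name);
    // at the entry of F.  "__probe_enter" is a made-up runtime routine, not
    // something that ships with LLVM.
    static void insertEntryProbe(Function &F) {
      Module *M = F.getParent();
      LLVMContext &Ctx = M->getContext();
      IRBuilder<> B(&*F.getEntryBlock().getFirstInsertionPt());

      FunctionCallee Probe = M->getOrInsertFunction(
          "__probe_enter", Type::getVoidTy(Ctx),
          PointerType::getUnqual(Type::getInt8Ty(Ctx)));
      Value *Name = B.CreateGlobalStringPtr(F.getName());
      B.CreateCall(Probe, {Name});
    }

A matching __probe_exit call before each return instruction would complete the entry/exit pair; that part is left out to keep the sketch short.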
Reid,

Adding this kind of instrumentation pass would be very valuable in LLVM. We don't have any such thing at present, and there could be multiple uses for it.

Joel Stanley did an MS thesis last year that could complement this kind of pass nicely. Joel's thesis was on dynamic performance instrumentation guided by explicit queries within the application (I will forward you a copy). Think about it as two things:

(1) A simple performance query language that allows queries to be embedded within an application (e.g., "how many L2 cache misses does the loop nest labeled X incur, in those instances when the moving-average cost of function Y is more than 3x its long-term average, i.e., function Y has been unusually slow"). The language allows the user to define arbitrary "metrics," to specify routines that can be used to measure those metrics, and then to query those metrics for arbitrary points or intervals within the application. A number of common computational performance metrics and their measurement routines are predefined, e.g., elapsed user/total/system time, L1/L2 cache misses, TLB misses. Many more predefined ones can be added, including OS, networking, and other kinds of metrics.

The query language is actually implemented as a simple API. Joel wrote an LLVM pass that recognizes calls to this API and replaces them with initial calls to his runtime system.

(2) A sophisticated runtime system that dynamically inserts and removes calls to instrumentation routines that actually do the measurement. This is driven by the requirements of the actual queries; e.g., for the example query above, you would insert instrumentation for L2 cache misses around loop nest X.

Your automatic pass could potentially use Joel's runtime support to do the actual work of inserting and removing instrumentation -- the pass would only have to insert the appropriate queries in our query "language" API.

Caveat: the runtime instrumentation library has only been lightly tested and isn't robust yet.

--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/
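As a purely illustrative sketch, with names that are made up rather than taken from Joel's thesis or from LLVM, the embedded-query idea might look like this from the application side: the program declares a metric and brackets a region with begin/end markers, and an instrumentation pass later lowers those markers into calls to the measurement runtime:

    // Hypothetical query API; these declarations are NOT the actual API from
    // Joel Stanley's thesis, they only illustrate its shape.
    extern "C" {
    typedef int pq_metric_t;
    pq_metric_t pq_declare_metric(const char *name);               // e.g. "l2_misses"
    void pq_begin_interval(pq_metric_t metric, const char *label); // start measuring
    void pq_end_interval(pq_metric_t metric, const char *label);   // stop measuring
    }

    void loop_nest_X(double *a, int n) {
      static pq_metric_t L2 = pq_declare_metric("l2_misses");
      pq_begin_interval(L2, "loop_nest_X"); // a pass rewrites these markers into
      for (int i = 0; i < n; ++i)           // calls into the measurement runtime
        a[i] = a[i] * 2.0 + 1.0;
      pq_end_interval(L2, "loop_nest_X");
    }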
like "main" which implies all the > processing of the program) while others are at the bottom (like > "strlen" > which implies no further calls and is a leaf node in the call chain). > In > between these two extremes (root and leaf), the call chain will fan-in > (like strlen) and fan-out (like main). Architecturally, we can speak > of > the fan-out points being the places in the application code where a > module boundary is being crossed. By instrumenting these fan-out > points > we can be assured that we're instrumenting things that (a) have > architectural significance (i.e. good quality of information) and (b) > imply enough processing that the overhead of instrumentation is > negligible. > > With the above kind of instrumentation in mind, I consider such > auto-instrumentation as "just another pass" in LLVM. That is, I believe > LLVM makes it pretty easy to do the call graph analysis, find the > "fan-out" points, and insert the instrumentation code. So, given that > its relatively easy to do this, I have the following questions for the > list: > > (1) Would others find this kind of diagnostic instrumentation useful? > > (2) Would it be useful to turn the call graph data ino a pretty picture > via graphviz (both statically and dynamically) ? > > (3) How much of this auto-instrumentation pass is already written in > existing passes? > > (4) Are there other ways to achieve the same ends? > > (5) Can someone give me some tips on how to implement this? It would > be > my first LLVM pass. > > On a side note, there are source language constructs for which I'm > wanting to aggregate the data captured by the instrumentation.. Call > chains of functions are great but sometimes you can't see the forest > for > the trees because the information is too dense. What is useful is to > aggregate the performance data at higher levels of abstraction. For > example, in a multi-threaded, object-oriented program you might want to > see aggregation at the process, thread, class, object instance, and > method levels. In order to instrument for these kinds of aggregated > views, I need to be able to provide some kind of "context" (process id, > thread id, class, etc.) in which a data point is captured. This > implies > that I need the instrumentation pass to understand some things about > the > source-level language. and possibly capture information about the > environment the instrumentation application will run in. > Unfortunately, > that means that the pass would either become language specific or > environment specific because I would have to ensure that the source > language compiler added the necessary constructs to the generated code > to provide the contextual information needed by the pass. This raises > a > couple questions more questions: > > (1) Is there be a general mechanism to communicate information between > source language compiler and LLVM pass? If there isn't, should there > be? In the case I describe above it would be *highly* useful (IMO) to > have the source language compiler provide source level information for > a > language independent pass to use later. > > (2) Is there any existing mechanism in LLVM for providing (1) directly? > What I'm thinking of is some kind of API that the source language > compiler can use to add addtional information that any subsequent pass > might need to use. > > Reid.
Vikram,

Thanks for the salient feedback. This sounds _very_ interesting. I'm reading Joel's thesis in my "spare" time. At 75 pages it might take some time. I'll respond in more detail when I have a better understanding of what Joel has done and what support is already in LLVM.

Reid.
--
Reid Spencer
President & CTO
eXtensible Systems, Inc.
rspencer at x10sys.com
On Wed, 2004-04-07 at 09:45, Chris Lattner wrote:

> Sure: analyze -print-callgraph foo.bc

Cool. Didn't know about that.

> There are various instrumentation passes already in LLVM, for example, to
> trace instructions as they execute, to insert profiling instrumentation,
> etc. These live in lib/Transforms/Instrumentation/
> ...
> I would take a look at other passes that do similar things. For example,
> look at clients of the CallGraph class and at a couple of the
> instrumentation classes. If you know what code you want to insert, it
> should be fairly straightforward.

Okay, will do. I didn't know we had a CallGraph class already!

Reid.
Hi Reid,

Reid Spencer wrote:

> With the above kind of instrumentation in mind, I consider such
> auto-instrumentation as "just another pass" in LLVM. That is, I believe
> LLVM makes it pretty easy to do the call graph analysis, find the
> "fan-out" points, and insert the instrumentation code. So, given that
> it's relatively easy to do this, I have the following questions for the
> list:

First, thanks for your brilliant summary of instrumentation for performance analysis. I am currently working on program structure analysis for regression test selection and minimization, and found many similar points to what you describe.

> (1) Would others find this kind of diagnostic instrumentation useful?

I would: it occurs to me that whenever a program grows big, it is crucial to get measures from selected indicators, at least to identify performance bottlenecks or to do coverage analysis (which parts of the code were used during a program execution). Instrumentation is a cornerstone of properly testing a program; without it you can't get your test coverage and thus lack fundamental information for properly selecting which tests are relevant to (re)validate your program.

> (2) Would it be useful to turn the call graph data into a pretty picture
> via graphviz (both statically and dynamically)?

It may, but it may be even more interesting to output the call graph to a database which can be processed later. For instance, the gcc-introspector project (http://introspector.sf.net) outputs to an RDF database, from which many things can be done. For some programs (if not many), the call graphs will be too complex. You can see some "static call graphs" (resulting from static analysis) of a GCC source file I made here: <http://people.type-z.org/seb/bordel/sched.png>. And this is only a one-level call graph of one source file.

> (snip)
> This implies that I need the instrumentation pass to understand some
> things about the source-level language and possibly capture information
> about the environment the instrumented application will run in.
> Unfortunately, that means that the pass would become either language
> specific or environment specific, because I would have to ensure that
> the source language compiler added the necessary constructs to the
> generated code to provide the contextual information needed by the
> pass. This raises a couple more questions:

What you are talking about is very similar to what I would like to do: get structural information (list of functions, what they call, lists of classes, etc.). This means accessing and querying a program's structure through an API. There was a post yesterday on LLVM/OpenC++, which is an attempt to offer C++ developers an API to reflect a C++ program's structure (they call it "meta-programming", but to me it seems mostly like reflectivity).

> (1) Is there a general mechanism to communicate information between the
> source language compiler and an LLVM pass? If there isn't, should there
> be? In the case I describe above it would be *highly* useful (IMO) to
> have the source language compiler provide source-level information for a
> language-independent pass to use later.

I also think it would be highly useful to have this information, at least for the following use cases:

* Add instrumentation operations at selected program points.
* Allow program structural-analysis passes (which can detect, for instance, bad design in program structure, or optimise recurrent structural patterns).
* Introspect a program to automatically generate bindings to other languages. Language X for LLVM could easily use libraries programmed in language Y for LLVM, because all program information is accessible and does not have to be generated using external utilities (like SWIG, for instance).

I must say that providing a compiler infrastructure that offers even a basic reflectivity facility for all front-ends would really ease the weaving of different programs. For instance, scripting languages like Python or Ruby (or anything dynamic) would have almost free bindings to any C or C++ library.

I hope this is possible, and am ready to help!

-- Sébastien
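Picking up the suggestion to export the call graph to a database rather than only to Graphviz, a minimal sketch of a helper that dumps direct call edges as subject/predicate/object lines is below. The <fn:...> and <calls> vocabulary is invented for illustration, real N-Triples output would need proper URIs and escaping, and the C++ CallGraph API shown is from a much newer LLVM than the one in this thread:

    #include "llvm/Analysis/CallGraph.h"
    #include "llvm/IR/Function.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    // Dump every direct call edge as a subject/predicate/object line that a
    // post-processing script could load into an RDF store or database.
    static void dumpCallEdges(const CallGraph &CG, raw_ostream &OS) {
      for (const auto &Entry : CG) {
        const Function *Caller = Entry.first;
        if (!Caller || Caller->isDeclaration())
          continue; // skip the synthetic external node and declarations
        for (const auto &Record : *Entry.second) {
          if (const Function *Callee = Record.second->getFunction())
            OS << "<fn:" << Caller->getName() << "> <calls> <fn:"
               << Callee->getName() << "> .\n";
        }
      }
    }

The resulting lines could be loaded into an RDF store, or simply into a relational table of (caller, callee) pairs, for the kind of structural queries described above.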