thr3ads.net - dtrace discuss - [dtrace-discuss] Moving average in a DTrace aggregation [Oct 2006]

Thierry Manfé

2006-Oct-19 12:08 UTC

[dtrace-discuss] Moving average in a DTrace aggregation

Hello,

    Some questions about DTrace:

    Is there a way to compute and store a moving average in an aggregation?

    Can we assign anything other than a function to an aggregation,
    i.e. is it possible to assign the value of a variable?

    Please reply directly to me since I am not on the alias.

    Thanks,
    Thierry
-- 
	Thierry Manfé Sun Microsystems, Market Development Engineering
Phone: +33-1-34-03-01-64
Mobile: +33-6-84-62-85-10

http://partneradvantage.sun.com
http://opensolaris.org
http://netbeans.org


Thierry Manfé

2006-Oct-19 14:35 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

    Good point Jon,

    Some background: I want to check the amount of free RAM (and other
    system parameters) at regular time intervals. Due to the
    architecture/scope of the project, I want to do this using DTrace and
    its associated Java API available in Nevada (Sol11).

    I take for granted that the only way to do some pooling on DTrace
    data with its Java API is to store the data in an aggregation and
    then call getAggregate().

    So, back to my example, the objective is to store the amount of
    free RAM in an aggregation. Here is a code snippet that shows how
    close I could get to the objective:

    profile:::tick-500msec {
        this->ram_used = `availrmem - `freemem;
        @ram["ramUsed"] = avg(this->ram_used);
    }

    Yet, this piece of code averages the free-RAM values from the
    beginning of the DTrace execution. So if the free RAM changes
    swiftly, there will be a gap/delay between the real value and the
    one stored in the aggregation.

    One way to avoid that would be to compute a moving average instead,
    i.e. an average based on the last N values instead of all the values
    since the start of the execution. Yet, I have not found the right
    way to do this.

    Hope this helps to clarify.
    Thierry

Jon Anderson wrote:
> Thierry,
>
> It would probably be easier if you gave a more specific example
> of what you are trying to achieve. From what you have said so
> far, my understanding is you want to maintain a combined avg
> and an average of N iterations. If this is correct, can you not
> just use multiple aggregations for this?

Chip Bennett

2006-Oct-19 17:10 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Thierry Manfé wrote:
>     I take for granted that the only way to do some pooling on DTrace
> data with its Java API
>     is to store the data in an aggregation and then call getAggregate().

That's not the only way to pool data, but I'm not sure what you can pull
out with the API.  If you can grab static arrays from the API, you could
have ready the last 20 values with this:

int running, ram_used[20];
profile:::tick-500msec
{
   ram_used[running] = `availrmem - `freemem;
   running = (running + 1) % 20;
}

However, I have a suspicion that the only thing the API has access to is 
the aggregation buffer and the trace buffer, so is it possible to design 
your API interface so that it only checks the aggregation buffer when a 
certain message appears in the principal buffer?  For example:

#pragma D option quiet
BEGIN
{
   @ram["ramUsed"] = avg(0);
   running = 0;
}
profile:::tick-500msec
/ !running /
{
   clear(@ram);
}
profile:::tick-500msec
{
   ram_used  = `availrmem - `freemem;
   @ram["ramUsed"] = avg(ram_used);
   running++;
}
profile:::tick-500msec
/ running == 20 /
{
   running = 0;
   trace ("clearing\n");
}

This D program clears the @ram aggregation every 20 ticks.  However, if
you happen to capture the current aggregation value with the API just after
it's been cleared, you're not going to get a very good average.
However, this script also traces the word "clearing" when the
aggregation reaches 20 values collected, but doesn't actually clear the
aggregation until the next tick.  Does the API allow you to capture the
appearance of "clearing" in the trace buffer and then go get the current
aggregation value before it gets cleared?  Again, I don't know the
capabilities of the API.

Chip

Nick Stephen

2006-Oct-20 12:17 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Attached is a simple program that maintains a rolling (or moving)
average and prints it out. It uses an aggregation although it doesn't
need to for the task at hand.


On a side-question, what appears strange is that whilst the attached 
code works, the sequence:

dtrace:::BEGIN
{
	@result = sum(0);
}


profile:::tick-1sec
{
	clear(@result);
	@result = sum(`freemem);
	printa(@result);
}

will always display a result of zero (0) whatever the values of
total/elems. It's as if the 'clear' is being executed out-of-order with
respect to the 'sum' aggregation, or as if the sum is being ignored.

  [ Nick ]




-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: freemem.d
URL:
<http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20061020/75044eab/attachment.ksh>

Tom Erickson

2006-Oct-20 23:26 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Chip,

On Thu, Oct 19, 2006 at 12:10:14PM -0500, Chip Bennett wrote:
> Thierry Manfé wrote:
> >    I take for granted that the only way to do some pooling on DTrace
> >data with its Java API
> >    is to store the data in an aggregation and then call getAggregate().

If you use the printa() action, you can look at a Map view of the
Aggregation in the resulting PrintaRecord.  However, using printa()
thereafter makes the aggregation unavailable to the getAggregate()
method, so choose one way or the other to get your aggregation data and
stick to it (ProbeData notification or asynchronous getAggregate()
request). If you plan to use getAggregate(), don''t use the clear()
action at all in your DTrace program. Instead, specify which, if any,
aggregations to clear as arguments to getAggregate().

> That's not the only way to pool data, but I'm not sure what you can pull
> out with the API.  If you can grab static arrays from the API, you could 
> have ready the last 20 values with this:
> 
> int running, ram_used[20];
> profile:::tick-500msec
> {
>   ram_used[running] = `availrmem - `freemem;
>   running = (running + 1) % 20;
> }
> 
> However, I have a suspicion that the only thing the API has access to is 
> the aggregation buffer and the trace buffer

Right, the Java DTrace API is not meant to access D arrays, so use an
aggregation for data you want to access outside of your D program.

> , so is it possible to design
> your API interface so that it only checks the aggregation buffer when a 
> certain message appears in the principal buffer?  For example:
> 
> #pragma D option quiet
> BEGIN
> {
>   @ram["ramUsed"] = avg(0);
>   running = 0;
> }
> profile:::tick-500msec
> / !running /
> {
>   clear(@ram);
> }
> profile:::tick-500msec
> {
>   ram_used  = `availrmem - `freemem;
>   @ram["ramUsed"] = avg(ram_used);
>   running++;
> }
> profile:::tick-500msec
> / running == 20 /
> {
>   running = 0;
>   trace ("clearing\n");
> }
> 
> This D program clears the @ram aggregation every 20 ticks.  However, if
> you happen to capture the current aggregation value with API just after
> it's been cleared, you're not going to get a very good average.
> However, this script also traces the word "clearing" when the
> aggregation reaches 20 values collected, but doesn't actually clear the
> aggregation until the tick.  Does the API allow you to capture the
> appearance of "clearing" in the trace buffer and then go get the current
> aggregation value before it gets cleared?  Again, I don't know the
> capabilities of the API.

The API lets you capture the appearance of anything you write to the
trace buffer. However, getAggregate() is asynchronous with regard to the
consumer thread. Why don't you clear the aggregation through the
getAggregate() interface rather than using the clear() action?
Otherwise, use printa() to control when you generate a PrintaRecord
relative to other probe actions.

The following diagram (installed with the API) should give a clear idea
of the API's capabilities:

    file:///usr/share/lib/java/javadoc/dtrace/html/JavaDTraceAPI.html

Tom

Nick Stephen

2006-Oct-24 12:22 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

FYI I've blogged on how to maintain a moving average in dtrace here:

http://blogs.sun.com/nickstephen/entry/dtrace_and_moving_rolling_averages

  [ Nick ]





Tom Erickson

2006-Oct-25 07:20 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Nick,

I liked your blog and decided to use your algorithm to add a moving
average option to Chime. I like the idea of pulling this algorithm out
of the DTrace script into Chime, because it gives the ability to turn
moving averages on and off (or change the number of values included in
the average) while the trace is still running. This is an experimental
new feature (downloadable now) accessed by right-clicking a display and
selecting "Set Moving Average ...". You can also set the initial moving
average (the default is none) in the display creation wizard: In the
step labeled "Test Run the Display", simply click the Run Display button
and set the moving average on the resulting test, then commit the
initial value with Finish.

If the feature seems useful, I can also add the weighted moving average
and/or the exponential moving average mentioned in your blog.  Let me
know what you think.
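
For readers unfamiliar with the exponential variant mentioned above, here is a minimal client-side sketch in Java. It is not Chime's code and not taken from Nick's blog; the class name `ExponentialMovingAverage` and the smoothing factor `alpha` are assumptions made for illustration. Each new sample is blended into the running value as `alpha * sample + (1 - alpha) * previous`, so only one number has to be stored instead of the last N samples.

```java
/**
 * Hypothetical helper (not part of Chime or the Java DTrace API):
 * an exponential moving average that keeps a single running value.
 */
class ExponentialMovingAverage {
    private final double alpha;   // assumed smoothing factor, 0 < alpha <= 1
    private double value;         // current smoothed value
    private boolean seeded;       // false until the first sample arrives

    ExponentialMovingAverage(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Blend one raw sample into the running value and return the result. */
    double add(double sample) {
        if (!seeded) {
            value = sample;       // the first sample seeds the average
            seeded = true;
        } else {
            value = alpha * sample + (1.0 - alpha) * value;
        }
        return value;
    }

    double value() {
        return value;
    }

    public static void main(String[] args) {
        ExponentialMovingAverage ema = new ExponentialMovingAverage(0.5);
        for (double sample : new double[] {10.0, 20.0, 30.0}) {
            System.out.println(ema.add(sample));   // prints 10.0, 15.0, 22.5
        }
    }
}
```

A larger `alpha` tracks sudden changes more closely; a smaller one smooths more aggressively.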

Earlier, you mentioned some strange behavior in a script with clear()
and printa() out of the usual order:

dtrace:::BEGIN
{
	@result = sum(0);
}

profile:::tick-1sec
{
	clear(@result);
	@result = sum(`freemem);
	printa(@result);
}

I tried it and saw the same unexpected behavior.  It seems like a bug,
even though I don't see any reason why you'd ever need to put clear()
ahead of printa().

Thanks,

Tom

P.S. Chip, I see now that I was addressing the wrong person in my
earlier response, that you are not the one trying to understand the Java
DTrace API. Sorry for not reading more closely. :-)


Jonathan Haslam

2006-Oct-25 11:53 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

> profile:::tick-1sec
> {
> 	clear(@result);
> 	@result = sum(`freemem);
> 	printa(@result);
> }
>
> I tried it and saw the same unexpected behavior.  It seems like a bug,
> even though I don't see any reason why you'd ever need to put clear()
> ahead of printa().

This isn't a bug and is expected behaviour. Whilst the data is
aggregated in probe context, actions such as clear(), trunc() and
printa() are executed asynchronously w.r.t. that update, in the
user-land consumer.

The fact that some actions are executed outside of probe context
is often forgotten and leads to behaviour which can, at first,
appear to be confusing. I think that the documentation probably
needs to be a bit stronger in this area and be more explicit.

Cheers.

Jon.

Thierry Manfé

2006-Oct-27 14:25 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Tom,

    Just to make sure I understand: you have implemented Nick's
    algorithm in Chime, i.e. in the "client" Java code, not in the D
    "server" code, right?

    I guess the piece of D code used for this feature is still using an
    aggregation to return N non-averaged values, and Chime gets these
    values by calling getAggregate().

    Did I get it right?

    Thanks,
    Thierry

-- 
	Thierry Manfé Sun Microsystems, Market Development Engineering
Phone: +33-1-34-03-01-64
Mobile: +33-6-84-62-85-10

http://partneradvantage.sun.com
http://opensolaris.org
http://netbeans.org


Tom Erickson

2006-Oct-27 23:18 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Thierry,

On Fri, Oct 27, 2006 at 04:25:34PM +0200, Thierry Manfé wrote:
>    Tom,
>
>    Just to make sure I understand: you have implemented Nick's
> algorithm in Chime
>    i.e. in the "client" Java code not in D "server" code, right?

Yes.

>    I guess the piece of D code used for this feature is still using an
> aggregation to return
>    N non-averaged values, and Chime gets these values by calling
> getAggregate().

Yes, except I would remove "N" from that sentence. Making the client
responsible for N allows you to change N while the DTrace program is
running.
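
To make the "client is responsible for N" idea concrete, here is a minimal Java sketch of a sliding-window average whose window size can be changed while samples keep arriving. It is an illustration under stated assumptions, not Chime's implementation: the class `MovingAverage` and its methods `add`, `setSize` and `average` are hypothetical names, and the step that fetches each raw value from the aggregation (for example via getAggregate()) is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical client-side helper (not part of Chime or the Java DTrace API):
 * averages the last N raw samples, where N may change at run time.
 */
class MovingAverage {
    private final Deque<Long> window = new ArrayDeque<>();
    private int size;    // N: number of most recent samples to average
    private long sum;    // running sum of the samples currently in the window

    MovingAverage(int size) {
        setSize(size);
    }

    /** Change N on the fly; the oldest samples are dropped if the window shrinks. */
    void setSize(int newSize) {
        if (newSize < 1) {
            throw new IllegalArgumentException("window size must be >= 1");
        }
        size = newSize;
        while (window.size() > size) {
            sum -= window.removeFirst();
        }
    }

    /** Record one raw sample and return the updated moving average. */
    double add(long sample) {
        window.addLast(sample);
        sum += sample;
        if (window.size() > size) {
            sum -= window.removeFirst();
        }
        return average();
    }

    /** Average of the samples currently in the window (0.0 if empty). */
    double average() {
        return window.isEmpty() ? 0.0 : (double) sum / window.size();
    }

    public static void main(String[] args) {
        MovingAverage ma = new MovingAverage(3);
        for (long sample : new long[] {10, 20, 30, 40}) {
            System.out.println(ma.add(sample));   // prints 10.0, 15.0, 20.0, 30.0
        }
        ma.setSize(2);                            // shrink N without restarting anything
        System.out.println(ma.average());         // prints 35.0
    }
}
```

Because the window lives on the client, the D program only has to report raw values and never needs to know N.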

Try it out with the "System Calls" display, which has a sparkline that
makes it easier to see the effect of the moving average (or try the
"Plot Over Time" right-click menu action). You can display the DTrace
program and see that it's unchanged.

Tom

Nick Stephen

2006-Oct-30 11:06 UTC

[dtrace-discuss] Re: Moving average in a DTrace aggregation

Tom Erickson wrote:
> Thierry,
>
> On Fri, Oct 27, 2006 at 04:25:34PM +0200, Thierry Manfé wrote:
>>    Tom,
>>
>>    Just to make sure I understand: you have implemented Nick's
>> algorithm in Chime
>>    i.e. in the "client" Java code not in D "server" code, right?
>
> Yes.
>
>>    I guess the piece of D code used for this feature is still using an
>> aggregation to return
>>    N non-averaged values, and Chime gets these values by calling
>> getAggregate().
>
> Yes, except I would remove "N" from that sentence. Making the client
> responsible for N allows you to change N while the DTrace program is
> running.
>
> Try it out with the "System Calls" display, which has a sparkline that
> makes it easier to see the effect of the moving average (or try the
> "Plot Over Time" right-click menu action). You can display the DTrace
> program and see that it's unchanged.

The inconvenience of implementing this client-side is that it means 
pulling more data from the managed system and potentially having to 
store more data (all the samples not yet collected) on the monitored system.

The advantage of course is the ability to store all the historical raw 
data and to be able to modify the averaging or other function on the 
client side without changing the d-script.

  [ Nick ]


-- 
http://blogs.sun.com/nickstephen
