thr3ads.net - dtrace discuss - [dtrace-discuss] profile provider: is it me doing stupid things? [Feb 2007]

If this information is useful, please help other people find it:
Share via:

Paul van den Bogaard

2007-Feb-15 12:49 UTC

[dtrace-discuss] profile provider: is it me doing stupid things?

Just showing someone how great DTrace is and then we spot something I do not
understand.
Of course it can be a major misunderstanding of myself. Would appreciate another
(expert) look upon this.

The goal we try to achieve is trying to get insight if there is a bursty nature
in the time slot when system calls are done. Following DTrace snippet is tried:

dtrace -q -p 3173 -n ''syscall::read:entry  { @[probefunc] = count(); } 
profile:::tick-250ms { printa("%6s %@6d\n", @);
printf("\n"); clear(@); }''

This  to chop up a one second interval into four different equal sized 250
millsec slots. Counting the number of read and write syscalls that fall in each
slot.
This one is run against a process that reads a single character and writes that
character (stdin, stdout) continuously. Wrote it myself so I dare say it is
functionally correct :-)

The output I get is:

  read  15078
  read      0
  read      0
  read      0

And this pattern continues till hell freezes over: first slot has all the
values, others are always zero. Tried this with 200msec and 100msec intervals
too. No difference.

Mmmm I expect more of a uniform distribution. Lets try to sample in DTrace a
little different. In the end it might be an unforeseen side effect of the
application :)

The following DTrace script was used:

BEGIN
{
        baseline = timestamp;
}

syscall::read:entry
/pid == $target/
{
        this->tmp = (timestamp - baseline) / (1000 * 1000 * 250);
        @[this->tmp] = count();
}

profile:::tick-1000ms
{
        baseline = timestamp;
        printa("%3d  %@u\n", @);
        clear(@);
}

END
{
}

And this one does show a uniform distribution of the system calls as I expected:
  3  10601
  2  10610
  1  10940
  0  11534

Question: why is the profile:::tick-xxxmsec behaving this way? What am I
missing?

Thanks
Paul
 
 
This message posted from opensolaris.org

Chip Bennett

2007-Feb-15 16:04 UTC

head link

[dtrace-discuss] profile provider: is it me doing stupid things?

Paul,

I haven''t checked this by experiment, but you''re clearing the 
aggregation values at four times the default aggrate (1 second), which 
is how often the dtrace command consumes the aggregation data.  You 
might try changing the aggrate to be the same as the tick.

Chip

Paul van den Bogaard wrote:> Just showing someone how great DTrace is and then we spot something I do
not understand.
> Of course it can be a major misunderstanding of myself. Would appreciate
another (expert) look upon this.
>
> The goal we try to achieve is trying to get insight if there is a bursty
nature in the time slot when system calls are done. Following DTrace snippet is
tried:
>
> dtrace -q -p 3173 -n ''syscall::read:entry  { @[probefunc] =
count(); }  profile:::tick-250ms { printa("%6s %@6d\n", @);
printf("\n"); clear(@); }''
>
> This  to chop up a one second interval into four different equal sized 250
millsec slots. Counting the number of read and write syscalls that fall in each
slot.
> This one is run against a process that reads a single character and writes
that character (stdin, stdout) continuously. Wrote it myself so I dare say it is
functionally correct :-)
>
> The output I get is:
>
>   read  15078
>   read      0
>   read      0
>   read      0
>
> And this pattern continues till hell freezes over: first slot has all the
values, others are always zero. Tried this with 200msec and 100msec intervals
too. No difference.
>
> Mmmm I expect more of a uniform distribution. Lets try to sample in DTrace
a little different. In the end it might be an unforeseen side effect of the
application :)
>
> The following DTrace script was used:
>
> BEGIN
> {
>         baseline = timestamp;
> }
>
> syscall::read:entry
> /pid == $target/
> {
>         this->tmp = (timestamp - baseline) / (1000 * 1000 * 250);
>         @[this->tmp] = count();
> }
>
> profile:::tick-1000ms
> {
>         baseline = timestamp;
>         printa("%3d  %@u\n", @);
>         clear(@);
> }
>
> END
> {
> }
>
> And this one does show a uniform distribution of the system calls as I
expected:
>   3  10601
>   2  10610
>   1  10940
>   0  11534
>
> Question: why is the profile:::tick-xxxmsec behaving this way? What am I
missing?
>
> Thanks
> Paul
>  
>  
> This message posted from opensolaris.org
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
>

Bryan Cantrill

2007-Feb-15 17:22 UTC

head link

[dtrace-discuss] profile provider: is it me doing stupid things?

On Thu, Feb 15, 2007 at 04:49:45AM -0800, Paul van den Bogaard
wrote:> Just showing someone how great DTrace is and then we spot something I do
not understand.
> Of course it can be a major misunderstanding of myself. Would appreciate
another (expert) look upon this.
> 
> The goal we try to achieve is trying to get insight if there is a bursty
nature in the time slot when system calls are done. Following DTrace snippet is
tried:
> 
> dtrace -q -p 3173 -n ''syscall::read:entry  { @[probefunc] =
count(); }  profile:::tick-250ms { printa("%6s %@6d\n", @);
printf("\n"); clear(@); }''
> 
> This  to chop up a one second interval into four different equal sized 250
millsec slots. Counting the number of read and write syscalls that fall in each
slot.
> This one is run against a process that reads a single character and writes
that character (stdin, stdout) continuously. Wrote it myself so I dare say it is
functionally correct :-)
> 
> The output I get is:
> 
>   read  15078
>   read      0
>   read      0
>   read      0
> 
> And this pattern continues till hell freezes over: first slot has all the
values, others are always zero. Tried this with 200msec and 100msec intervals
too. No difference.
> 
> Mmmm I expect more of a uniform distribution. Lets try to sample in DTrace
a little different. In the end it might be an unforeseen side effect of the
application :)
It''s actually a side-effect of both the switchrate and the aggrate. 
The
switchrate dictates how frequently libdtrace(3LIB) reads the principal
buffer from dtrace(7D); the aggrate dictates how frequently the aggregation
buffers are read and processed.  Because your printa() both (1) emits data
to the principal buffer (to indicate that the printa() action should be
taken at user-level and (2) implicitly expects the aggregation to have
been updated at at least the same frequency, your script will only work
as you expect as long as your profile probe is of a lower frequency than
_both_ the switchrate _and_ the aggrate.  And, as you might have guessed
by now, both the switchrate and the aggrate default to 1hz.  

So:  to get your script to behave as you (rightly) expected, add this to
the command line:

   -x switchrate=250ms -x aggrate=250ms

While both the aggrate and the switchrate _are_ documented, their
interaction in a problem like this is quite subtle, and not explicitly
called out anywhere in the documentation; apologies for that...

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

Maybe Matching Threads

Search for more reasonably related threads

dtrace discuss - Feb 2007 - profile provider: is it me doing stupid things?

[dtrace-discuss] profile provider: is it me doing stupid things?

[dtrace-discuss] profile provider: is it me doing stupid things?

[dtrace-discuss] profile provider: is it me doing stupid things?

Maybe Matching Threads