Hello All,

I am trying to write a D script using the pid provider that dumps out:

functionname %usr %sys %idle

(In my script I track pid:::entry,return as well as syscall:::entry,return.)

I am expecting output like the following, emitted every 10 seconds or so:

Process: 192

functionname   %time  %usr  %sys  %idle
-----
memcpy             5    80     0     20
read               5     0    80     20
write             10     0    60     40
dumbfunction      40     0     0    100
hogfunction       40   100     0      0
------
Total            100    44    10     46

Using timestamp (or vtimestamp) I can find out the time (and normalize it to %time), which I think is the total time, %usr + %idle (similarly I can track syscall times too, which is %sys + %idle).

However, the thing that's bugging me is how to separate %usr and %idle out of %time.

If I can do that, I can track functions which are causing the CPU to be idle when it was expected to be doing useful work. Also, if the total time is high and its idle component is low, that indicates there is potential for rewriting the function so it does not become a hog.

Can this be achieved? I think it would be a great script for rooting out both idle-causing functions and functions that are CPU hogs.

Thanks.

Regards,
Jignesh
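For reference, a minimal sketch of the entry/return timing approach described above (the script name, the -p <PID> invocation, and the libc probe pattern are placeholders; enabling every function in a library has a real probe effect). It separates elapsed time (timestamp) from on-CPU time (vtimestamp) per function, but does not by itself split %usr from %sys:

---8<--- functime.d ---8<---
#!/usr/sbin/dtrace -s
/* Run as: ./functime.d -p <PID>
 * Note: this simple sketch does not handle nested or recursive calls.
 */

pid$target:libc::entry
{
        self->ts  = timestamp;          /* wall-clock ns */
        self->vts = vtimestamp;         /* on-CPU ns for this thread */
}

pid$target:libc::return
/self->ts/
{
        @elapsed[probefunc] = sum(timestamp - self->ts);
        @oncpu[probefunc]   = sum(vtimestamp - self->vts);
        self->ts  = 0;
        self->vts = 0;
}

tick-10sec
{
        printf("\n--- per-function totals (ns) ---\n");
        printa("%-20s elapsed %@d\n", @elapsed);
        printa("%-20s on-cpu  %@d\n", @oncpu);
        trunc(@elapsed);
        trunc(@oncpu);
}
---8<--- functime.d ---8<---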
Your goal is accomplished much more easily with prstat(1).

Using prstat with -Lm (per-LWP stats, report microstates) will tell you how much time each thread in a process is burning CPU, versus sleeping or waiting for a CPU (there are other microstates as well). Use -p <PID> to target a specific process.

Once you understand, at a high level, the behaviour of the processes and threads of interest, use DTrace (or other tools, depending on what the question is) to answer the remaining questions.

For example, if prstat indicates a particular process (or thread) is burning a lot of CPU in %usr, things like system call statistics are not particularly interesting. Rather, use the DTrace profile provider to begin profiling the activity of the process and understand where it's spending its time:

# dtrace -n 'profile-997 /arg1 != 0 && execname == "target_process"/ { @s[ustack(1)] = count(); }'

The above is a very simple clock-based profile that grabs the top of the user stack when the probe fires. Drill down further with the pid provider.

If you're burning time in the kernel, then:

# dtrace -n 'profile-997 /arg0 != 0/ { @s[stack(1)] = count(); }'

gives a quick kernel profile; or use lockstat(1), which is a DTrace consumer:

# lockstat -i 997 -Ik sleep <DURATION>

Your goal is a little ambitious (if I'm understanding your query correctly). That is, you'd like to accomplish in a single DTrace script what we typically accomplish using a couple of iterative steps. The early steps using prstat(1) are so fast, and so easy, that it's not worth looking for a more complex way to do it.

HTH,
/jim
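As a sketch of the "drill down with the pid provider" step (memcpy and <PID> are placeholders here), one possible follow-up is to count calls and sum on-CPU time for a suspect function identified by the profile:

# dtrace -n '
  pid$target::memcpy:entry { self->vts = vtimestamp; }
  pid$target::memcpy:return /self->vts/
  {
          @calls["calls"]     = count();
          @oncpu["on-CPU ns"] = sum(vtimestamp - self->vts);
          self->vts = 0;
  }' -p <PID>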
Hi Jim,

Yes, I like prstat -mLp, but it's still at the thread level. I am thinking more along the lines of function-level, attributed timings. Even if you get the snapshot from prstat -mLp, by the time you run the dtrace script it might be too late to match it up against the "snapshot" from prstat -mL. (Of course, I also lack the "Jim Mauro" brain to map the two result sets to each other correctly :-) )

Maybe we need an extended option in prstat itself to get by-function-name information out there (which prstat could do via a quick dtrace call). In fact, to be overly ambitious: figure out the I/O distribution by function name along with the CPU timings, to cover that aspect too (and memory??). This would add real value to "OpenSolaris", as no other OS can give such information out of the box without learning D :-)

Regards,
Jignesh
An alternative for user-process profiling is the collector/analyzer from Studio. See:

http://dpweb.sfbay/products/analyzer
http://dpweb.sfbay/products/analyzer/venus.html

To do the high-resolution profiling, specify "collect -p hi ...". Time in all microstates is collected, and you can easily drill down to annotated source and disassembly, march up and down the callstack, see a graphical display of profile events in a timeline with their callstacks, and more.

We also have a GUI on top of DTrace for kernel profiling, invoked as "er_kernel -p hi ...". See:

http://dpweb.sfbay/products/analyzer/kernel.html

Marty
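A minimal usage sketch of the collector workflow mentioned above (the target binary and the experiment name test.1.er are hypothetical; default experiment naming may differ by Studio release):

collect -p hi ./target_process      # record a high-resolution profiling experiment
analyzer test.1.er                  # browse functions, callstacks, source, timeline
er_print -functions test.1.er       # or print a text summary of time per function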
On Fri, Aug 26, 2005 at 02:05:55PM -0400, Jignesh Shah wrote:
> Maybe we need an extended option with prstat itself to get by-function-name
> information out there (which prstat could do via a quick dtrace call).

For whatever it's worth, this kind of analysis will become much easier once I have integrated the fix for 6311947:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6311947

The wad containing this fix is in the final stages of testing; it will integrate over the weekend or (more likely) very early next week. Look for it in an upcoming SX/OpenSolaris release, at any rate...

(Unfortunately, the presentation of the bug database doesn't differentiate comments; the suggestion of f_contains() and m_contains() in the referenced bug report is not mine but Roch Bourbonnais' -- and while (still) a good RFE, it's ultimately unrelated to 6311947.)

- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
Hi Jignesh,

I'm not sure if this is precisely the question you're asking, but you can find out when a thread comes off the CPU (or goes to sleep, or is woken up) using the sched provider. If you're trying to understand the duration for which a function causes a thread to sleep, the sched provider will probably be helpful.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
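As an illustration of the sched provider suggestion above, a minimal off-CPU sketch (the script name is hypothetical; run it against your process with -p <PID>) that sums the time each user stack spends off the CPU, i.e. sleeping or waiting for a CPU:

---8<--- offcpu.d ---8<---
#!/usr/sbin/dtrace -s
/* Run as: ./offcpu.d -p <PID> */

sched:::off-cpu
/pid == $target/
{
        self->off = timestamp;          /* thread is being descheduled */
}

sched:::on-cpu
/self->off/
{
        /* the user stack does not change while the thread is off CPU */
        @offcpu[ustack()] = sum(timestamp - self->off);
        self->off = 0;
}

END
{
        trunc(@offcpu, 10);             /* keep the ten largest contributors */
        printa(@offcpu);
}
---8<--- offcpu.d ---8<---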
I have been thinking about the sched provider, and have also spent some time playing with a script that uses it.

Here are some fundamental queries that I am not sure about (based on my limited knowledge of the internals):

* When a function boundary is crossed, can it safely be assumed that the thread is ON CPU? (Meaning all the ticks from the function call can be counted as CPU time, until the next sched off-cpu event?)

* When a function is about to return, can it be OFF CPU, or does it have to be ON CPU for the return probe to fire?

* Also a fundamentally stupid question: when a userland function calls a system call, is the self->ts variable in both probes the same variable? (That is, how does thread-local storage relate between userland calls and system calls?)

This will help me get started. Thanks.

Regards,
Jignesh
On Mon, Aug 29, 2005 at 11:34:16AM -0400, Jignesh Shah wrote:
> * When function boundaries are crossed.. can it be safely assumed that
>   it is ON cpu? (meaning all the ticks starting from the function call
>   can be assumed as CPU time.. till the next sched off-cpu event?)

No; there are a couple of types of kernel functions which are called between the off-cpu probe and the on-cpu probe:

 1. softints (see http://cvs.opensolaris.org/source/search?refs=add_softintr)
 2. the "save" and "restore" context callbacks (see http://cvs.opensolaris.org/source/search?refs=installctx)

These generally don't take much time, and the timing difference can be ignored. No userland functions will be called while the thread is off-cpu, of course.

> * When a function is about to be returned.. Again can it be OFF CPU
>   (meaning it has to be ON CPU to fire the return probe?)

Any of the above functions, if called, will have to return before the thread can proceed to the on-cpu probe, so yes.

> * Also a fundamentally stupid question: When a userland function calls
>   a system call, is the self->ts variable in both probes the same
>   variable? (Relation of thread-local storage between userland calls
>   and system calls)

Yes; thread-local variables are associated with the kernel thread, which is the same whether or not you are executing in userland.

Hope that helps,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
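A small sketch illustrating the last point (the fread/read pairing and -p <PID> are placeholder choices): a thread-local set in a userland pid probe is visible in the syscall probe fired by the same thread.

---8<--- tlscross.d ---8<---
#!/usr/sbin/dtrace -s
/* Run as: ./tlscross.d -p <PID> */

pid$target::fread:entry
{
        self->ts = vtimestamp;          /* set in a userland probe */
}

syscall::read:entry
/self->ts/
{
        /* same kernel thread, so the same self->ts is visible here */
        @d["on-CPU ns, fread:entry -> read:entry"] =
            quantize(vtimestamp - self->ts);
}

pid$target::fread:return
/self->ts/
{
        self->ts = 0;
}
---8<--- tlscross.d ---8<---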
On Mon, Aug 29, 2005 at 11:34:16AM -0400, Jignesh Shah wrote:
> * When function boundaries are crossed.. can it be safely assumed
>   that it is ON cpu? (meaning all the ticks starting from the function
>   call can be assumed as CPU time.. till the next sched off-cpu event?)
>
> * When a function is about to be returned.. Again can it be OFF CPU
>   (meaning it has to be ON CPU to fire the return probe?)

Any time a probe fires, the executing thread is -- by definition -- on CPU. There are a couple of kernel functions whose probes may fire on the way to a thread coming on and off the CPU.

> * Also a fundamentally stupid question: When a userland function
>   calls a system call, is the self->ts variable in both probes the
>   same variable? (Relation of thread-local storage between userland
>   calls and system calls)

Thread-local variables persist between userland and the kernel. The only caveat is that profile provider probes should be thought of as firing in their own thread -- thread-local variables set in the thread that's interrupted by the profile probe aren't visible from the context of the profile probe. For example, this script will generate no output:

---8<--- tls.d ---8<---
#!/usr/sbin/dtrace -s

sched:::on-cpu
{
        self->yes = 1;
}

profile:::profile-4999
/self->yes/
{
}
---8<--- tls.d ---8<---

Hope that helps.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl