thr3ads.net - dtrace discuss - [dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps? [Feb 2008]

If this information is useful, please help other people find it:
Share via:

Sean Liu

2008-Feb-13 19:58 UTC

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

Hi there,

On one of my clusters  I have following issues.
This particular cluster has two V240s with 2 CPUs each.
1. The CPU usage is high, often 100%, with both high sy and us time ( about 1:1
)
2. mutex is high, which goes as high as 6000+ / CPU (!)
3. cross call is high, as high as 7000/CPU

The cluster is running following stuff:
1. Veritas Cluster Server 5.0MP1, with 20 something service groups
2. MeasureWare from HP OVO

Before you say "upgrade the hardware!", there are a few problems 
1. The mutex is so high, that adding more CPUs would just worsen the problem. 
2. xcall is high, additional CPUs will have the burden too.

So here are the things I did:
1. Shutdown VCS, with everything else running.
user space cpu usage dropped about 20-35%
kernel space cpu usage dropped by about 30%
mutex dropped from thousands to tens with brief spikes of hundreds

2. Shutdown measure ware
CPU usage became 92-99% idle!
mutex dropped further, but it''s of no significance now.

3. With measure ware dowe, started up VCS
Now mutex does not go high a lot
cpu util in sy went high a little
but CPU usage in user space went high quite a lot, idle time goes to 0% often

So in short, measure ware and VCS is killing the box.

lockstat shows proc filesystem locking ( mutex spin! ) is high:
root at cbbceusmq2d:/tmp#more lockstat.out

Adaptive mutex spin: 68526 events in 5.048 seconds (13575 events/sec)

Count indv cuml rcnt     spin Lock                   Caller
-------------------------------------------------------------------------------
67056  98%  98% 0.00        3 pidlock                pr_readdir_procdir+0x74
  272   0%  98% 0.00        3 pidlock                pr_lookup_procdir+0xa0
  216   0%  99% 0.00        4 0x6000abd0200          cv_wait_sig+0x13c

Which is not too much of a surprise for VCS - lots of monitoring are simply
running versions of "ps" to check if a process is up and running

I can simply go to veritas and say their product is a piece of sh*t, but I
wonder:
1. Is Sun Cluster any better on this front?
2. For most of these checking/monitoring, is it possible to provide some kind of
non-locking proc operations so this specific type of applications can use?


--
This message posted from opensolaris.org

Sean Liu

2008-Feb-13 20:01 UTC

head link

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

I also meant to ask how to measure cpu time of a process and it''s
children?
Let''s say we have an agent, it starts up monitoring periodically -  how
do we measure how much CPU time this agent and it''s children ( and
grand children ) used?


--
This message posted from opensolaris.org

Brendan Gregg - Sun Microsystems

2008-Feb-14 00:01 UTC

head link

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

G''Day Sean,

On Wed, Feb 13, 2008 at 12:01:08PM -0800, Sean Liu
wrote:> I also meant to ask how to measure cpu time of a process and it''s
children?
> Let''s say we have an agent, it starts up monitoring periodically -
how do we measure how much CPU time this agent and it''s children ( and
grand children ) used?
DTrace can do this using the sched provider and vtimestamp, or in rough
terms using profile and progenyof(), eg:

# dtrace -n ''profile-1001 /pid == $target || progenyof($target)/ {
	@["on-CPU, 1001 Hertz count:"] = count(); }'' -p 929727
dtrace: description ''profile-1001 '' matched 1 probe
^C

  on-CPU, 1001 Hertz count:                                       638

So PID 929727 was on-CPU for 638 samples at 1001 Hertz.

However process CPU time is already tracked by microstate accounting and
provided via procfs, and can be shown using /usr/bin/ptime.  This does
include children time (the man page needs updating).  There are other
ways to get to this procfs data...

Brendan

-- 
Brendan
[CA, USA]

Sean Liu

2008-Feb-14 17:13 UTC

head link

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

Hi Brendan,

Thanks for your reply and I''ll look into it. Both ways look quite
promising.

Sean


--
This message posted from opensolaris.org

Sean Liu

2008-Feb-14 20:41 UTC

head link

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

It turned out ps wasn''t really what was using most CPU resource. The MQ
agent ( and its probes ) is.
I put in a small script to output continuously every second:
#!/usr/sbin/dtrace -qs

profile-1001
/pid == $target || progenyof($target)/
{
 @a["on-CPU, 1001 Hertz count:"] = count();
}

tick-1
{
 printa(@a);
 trunc(@a)
}

This is from Application Agent:
  on-CPU, 1001 Hertz count:                                         2
  on-CPU, 1001 Hertz count:                                         2
  on-CPU, 1001 Hertz count:                                         2
  on-CPU, 1001 Hertz count:                                       510
  on-CPU, 1001 Hertz count:                                       410
  on-CPU, 1001 Hertz count:                                       677
  on-CPU, 1001 Hertz count:                                       796
  on-CPU, 1001 Hertz count:                                       750
  on-CPU, 1001 Hertz count:                                       828
  on-CPU, 1001 Hertz count:                                       518
  on-CPU, 1001 Hertz count:                                       751
  on-CPU, 1001 Hertz count:                                      1086
  on-CPU, 1001 Hertz count:                                       604
  on-CPU, 1001 Hertz count:                                       992
  on-CPU, 1001 Hertz count:                                       568
  on-CPU, 1001 Hertz count:                                       736
 ...

And MQ agent:
...
  on-CPU, 1001 Hertz count:                                      1696
  on-CPU, 1001 Hertz count:                                      1743
  on-CPU, 1001 Hertz count:                                      1784
  on-CPU, 1001 Hertz count:                                      1679
  on-CPU, 1001 Hertz count:                                      1691
  on-CPU, 1001 Hertz count:                                      1547
 ...

Since profile probes fire on both CPUs, a 1001 Hz probe should fire 2002 times a
second correct? So 1743 fires is almost like 85% of CPU used.

Thanks Brendan for the idea.

Sean


--
This message posted from opensolaris.org

Brendan Gregg - Sun Microsystems

2008-Feb-14 21:05 UTC

head link

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

G''Day Sean,

On Thu, Feb 14, 2008 at 12:41:35PM -0800, Sean Liu
wrote:> It turned out ps wasn''t really what was using most CPU resource.
The MQ agent ( and its probes ) is.
> I put in a small script to output continuously every second:
> #!/usr/sbin/dtrace -qs
> 
> profile-1001
> /pid == $target || progenyof($target)/
> {
>  @a["on-CPU, 1001 Hertz count:"] = count();
> }
> 
> tick-1
> {
>  printa(@a);
>  trunc(@a)
> }
Yes, this makes sense.
> This is from Application Agent:
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                         2
>   on-CPU, 1001 Hertz count:                                       510
>   on-CPU, 1001 Hertz count:                                       410
>   on-CPU, 1001 Hertz count:                                       677
>   on-CPU, 1001 Hertz count:                                       796
>   on-CPU, 1001 Hertz count:                                       750
>   on-CPU, 1001 Hertz count:                                       828
>   on-CPU, 1001 Hertz count:                                       518
>   on-CPU, 1001 Hertz count:                                       751
>   on-CPU, 1001 Hertz count:                                      1086
>   on-CPU, 1001 Hertz count:                                       604
>   on-CPU, 1001 Hertz count:                                       992
>   on-CPU, 1001 Hertz count:                                       568
>   on-CPU, 1001 Hertz count:                                       736
>  ...
> 
> And MQ agent:
> ...
>   on-CPU, 1001 Hertz count:                                      1696
>   on-CPU, 1001 Hertz count:                                      1743
>   on-CPU, 1001 Hertz count:                                      1784
>   on-CPU, 1001 Hertz count:                                      1679
>   on-CPU, 1001 Hertz count:                                      1691
>   on-CPU, 1001 Hertz count:                                      1547
>  ...
> 
> Since profile probes fire on both CPUs, a 1001 Hz probe should fire 2002
times a second correct?
Yes, it will fire 2002 times a second.  There may be a tiny variance (such
as 0.1%) in the output (due to how aggregates are printed from the user-land
dtrace consumer, and how it is CPU scheduled) - but the numbers will sum
correctly.
> So 1743 fires is almost like 85% of CPU used.
Yes, it strongly suggests that.  We don''t know that for certain since
this
is a simple (yet useful) script which samples rather than traces.  For what
you are doing (identifying largest CPU consumer) sampling is probably
adequate.

You can get it to print out percentages, if that helps readability.
Something like:

profile-1001
/pid == $target || progenyof($target)/
{
	@a["on-CPU, percentage:"] = count();
}

tick-1
{
	normalize(@a, (1001 * `ncpus_online) / 100);
	printa(@a);
	trunc(@a)
}

The normalize() won''t be spot on, since I believe the float will be 
converted to an integer before it is used.  So there will be some small
rounding error, in addition to that of it being sample based.  But again,
if the point is to identify large CPU consumers - it should work fine.

Those percentages are for all CPUs.  Drop the "* `ncpus_online" for
per-CPU percentages.
> Thanks Brendan for the idea.
no worries,

Brendan

-- 
Brendan
[CA, USA]

dtrace discuss - Feb 2008 - How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?

[dtrace-discuss] How to measure the CPU time of a process and its children? And non-lock ps?