Tim Cook - PAE
2007-Jul-21 00:06 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
*,
So, I was reading the definition of sched:::on-cpu and sched:::off-cpu
to mean that any thread dispatched on a CPU would have that activity
bounded by on-cpu and off-cpu firings, and that the use of "timestamp"
between these points would accurately show how much CPU the thread was
using.
My prototype script gets a little over 90% on a nearly idle system, though (this
system has 2 CPUs, BTW):
$ cat on-off.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

int64_t start_bolt;
uint64_t start_n;
int64_t util;

BEGIN
{
	start_bolt = `lbolt64;
	start_n = timestamp;
	util = 0;
}

sched:::on-cpu
{
	self->on_n = timestamp;
}

sched:::off-cpu
/self->on_n/
{
	util += timestamp - self->on_n;
	self->on_n = 0;
}

profile:::tick-5s
{
	printf("elapsed (ticks): %ld\n", `lbolt64 - start_bolt);
	printf("elapsed (ns) : %lu\n", timestamp - start_n);
	printf("util (ns) : %ld\n", util);
	start_bolt = `lbolt64;
	start_n = timestamp;
	util = 0;
}
mashie[bash]# ./on-off.d
elapsed (ticks): 498
elapsed (ns) : 4979669863
util (ns) : 9137442436
elapsed (ticks): 500
elapsed (ns) : 5000000822
util (ns) : 9185407490
elapsed (ticks): 500
elapsed (ns) : 4999999221
util (ns) : 9157844649
mashie[bash]# vmstat 5 4 (run at the same time)
mashie ) vmstat 5 4
 kthr     memory              page               disk        faults       cpu
 r b w   swap   free re  mf pi po fr de sr s0 s1 -- --   in   sy   cs us sy id
 0 0 0 815384 906900  2  16  7  1  1  0  6  1 -5  0  0  646 1605  816  6  4 90
 0 0 0 625700 706632  0   6  0  0  0  0  0  0  0  0  0  848  880  628  2  2 97
 0 0 0 625700 706632  0   0  0  0  0  0  0  0  0  0  0  850  909  659  2  2 96
 0 0 0 625700 706632  0   0  0  0  0  0  0  0  0  0  0  869 1297  823  2  2 96
Does on-cpu, off-cpu also pick up when threads are in idle()? If so, what is
the best way to exclude that (just put clauses on fbt::idle)?
Thanks,
Tim
--
Tim Cook
Performance and Applications Engineering
Sun Microsystems
Ph: +1 650 257 4709
Ext: (70) 69841
Alexander Kolbasov
2007-Jul-21 00:27 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
Tim, you are observing the idle thread. Try adding a predicate to the on-cpu clause:

sched:::on-cpu
/(uintptr_t)curthread->t_startpc != (uintptr_t)`idle/

- akolb
Peter Lawrence
2007-Jul-21 01:21 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
Tim,

the "idle loop" is actually a separate thread in the kernel, so the system is almost always on _some_ thread! (In fact I have a hard time imagining what your system was doing in the other 10% of its time...)

Here's what I do, even though some folks don't like its coefficient of Interface Stability...

In S-11, and newer S-10, the functions named
	idle_enter()
and
	idle_exit()
are called by the Solaris idle loop in the order suggested by their names, so they can be traced with fbt::idle_enter:entry and fbt::idle_exit:entry.

YMMV, but this is "openSolaris", so you can always verify it with the actual idle loop source code any time you want. On the other hand, if the folks hadn't made the comment to me I would have given you the wrong function names (see pps below), so they do have a point!

-Pete Lawrence.

ps, the "idle" loop can be quite active, searching the ready queue for scheduled/runnable threads...

pps, in S-9, and older S-10, the functions are named
	set_cpu_idle()
and
	unset_cpu_idle()
don'cha just love backwards compatibility...!...
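
As an illustration of the approach described above, here is a minimal sketch that measures time spent in the idle loop directly. It assumes that idle_enter() and idle_exit() are available as fbt probe points on the running kernel (they are unstable kernel internals, as noted), and that both are called from the idle thread itself, so a thread-local variable can carry the start timestamp. The variable name idle_n and the use of an aggregation (@idle) are choices made for this sketch, not something taken from the thread.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Assumption: the idle thread calls idle_enter() when it starts idling. */
fbt::idle_enter:entry
{
	self->idle_n = timestamp;
}

/* Assumption: the same thread calls idle_exit() when it finds work. */
fbt::idle_exit:entry
/self->idle_n/
{
	/* Aggregations are kept per CPU, so concurrent updates do not race. */
	@idle = sum(timestamp - self->idle_n);
	self->idle_n = 0;
}

profile:::tick-5s
{
	/* Total idle nanoseconds across all CPUs in the last interval. */
	printa("idle (ns) : %@d\n", @idle);
	clear(@idle);
}
```

On a release that uses the older function names from the pps, the two fbt probe descriptions would need to be changed accordingly.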
Surya.Prakki at Sun.COM
2007-Jul-23 03:38 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
The idle thread always runs with priority -1, so you can keep it out this way:

sched:::on-cpu
/curthread->t_pri != -1/
{
	self->on_n = timestamp;
}
-surya
Tim Cook - PAE
2007-Jul-23 23:56 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
All,

Thanks for your suggestions. I went with Surya's method, as it has a comparison to a constant in the clause - hopefully the lowest overhead.

Thanks to Alexander as well for the example program to hide from clock() - I used it to test my script, which is now available at http://blogs.sun.com/timc. The purpose is to estimate how much of the CPU utilization we are seeing would not be seen via the old clock()-based accounting of S9 and earlier.

Regards,
Tim

--
Tim Cook
Performance and Applications Engineering
Sun Microsystems
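
For readers following the thread, a minimal sketch of how the original on-off.d might look with Surya's predicate applied. This is assembled from the messages above and is not the script published on the blog. Two things differ from the original and are assumptions of this sketch: the lbolt64 tick counter is dropped for brevity, and the global util counter is replaced with an aggregation (@util), since aggregations are maintained per CPU and avoid concurrent updates to a shared global.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

uint64_t start_n;

BEGIN
{
	start_n = timestamp;
}

/* Surya's predicate: skip the idle thread, which runs at priority -1. */
sched:::on-cpu
/curthread->t_pri != -1/
{
	self->on_n = timestamp;
}

/* Only threads that passed the predicate above have on_n set. */
sched:::off-cpu
/self->on_n/
{
	@util = sum(timestamp - self->on_n);
	self->on_n = 0;
}

profile:::tick-5s
{
	printf("elapsed (ns) : %lu\n", timestamp - start_n);
	printa("util (ns) : %@d\n", @util);
	clear(@util);
	start_n = timestamp;
}
```

Because the idle thread never sets self->on_n, its off-cpu firings are filtered out by the existing /self->on_n/ predicate, so util (ns) should now count only non-idle threads. On the 2-CPU system above, dividing it by twice the elapsed time gives a figure that can be compared with the us + sy columns of vmstat.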