Tim Cook - PAE
2007-Jul-21 00:06 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
*,

So, I was reading the definition of sched:::on-cpu and sched:::off-cpu
to mean that any thread dispatched on a CPU would have that activity
bounded by on-cpu and off-cpu firings, and that the use of "timestamp"
between these points would accurately show how much CPU the thread was
using.

My prototype script gets a little over 90% on a nearly idle system, though
(this system has 2 CPUs, BTW):

$ cat on-off.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

int64_t start_bolt;
uint64_t start_n;
int64_t util;

BEGIN
{
    start_bolt = `lbolt64;
    start_n = timestamp;
    util = 0;
}

sched:::on-cpu
{
    self->on_n = timestamp;
}

sched:::off-cpu
/self->on_n/
{
    util += timestamp - self->on_n;
    self->on_n = 0;
}

profile:::tick-5s
{
    printf("elapsed (ticks): %ld\n", `lbolt64 - start_bolt);
    printf("elapsed (ns) : %lu\n", timestamp - start_n);
    printf("util (ns) : %ld\n", util);
    start_bolt = `lbolt64;
    start_n = timestamp;
    util = 0;
}

mashie[bash]# ./on-off.d
elapsed (ticks): 498
elapsed (ns) : 4979669863
util (ns) : 9137442436
elapsed (ticks): 500
elapsed (ns) : 5000000822
util (ns) : 9185407490
elapsed (ticks): 500
elapsed (ns) : 4999999221
util (ns) : 9157844649

mashie[bash]# vmstat 5 4    (run at the same time)
mashie ) vmstat 5 4
 kthr     memory             page             disk        faults        cpu
 r b w   swap   free re  mf pi po fr de sr s0 s1 -- --   in   sy   cs us sy id
 0 0 0 815384 906900  2  16  7  1  1  0  6  1 -5  0  0  646 1605  816  6  4 90
 0 0 0 625700 706632  0   6  0  0  0  0  0  0  0  0  0  848  880  628  2  2 97
 0 0 0 625700 706632  0   0  0  0  0  0  0  0  0  0  0  850  909  659  2  2 96
 0 0 0 625700 706632  0   0  0  0  0  0  0  0  0  0  0  869 1297  823  2  2 96

Does on-cpu, off-cpu also pick up when threads are in idle()? If so, what is
the best way to exclude that (just put clauses on fbt::idle)?

Thanks,
Tim

--
Tim Cook
Performance and Applications Engineering
Sun Microsystems
Ph: +1 650 257 4709 Ext: (70) 69841
Alexander Kolbasov
2007-Jul-21 00:27 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
Tim,

you are observing the idle thread. Try adding

    sched:::on-cpu
    /(uintptr_t)curthread->t_startpc != (uintptr_t)`idle/

- akolb
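For context, here is a minimal sketch of how that predicate might slot into the original on-off.d. Everything except the predicate is taken from Tim's script; dropping the lbolt64 bookkeeping is a simplification made here for brevity, not something suggested in the thread.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

uint64_t start_n;
int64_t util;

BEGIN
{
	start_n = timestamp;
	util = 0;
}

/*
 * Record the dispatch time, but skip any thread whose start routine
 * is the kernel's idle() function, i.e. the per-CPU idle threads.
 */
sched:::on-cpu
/(uintptr_t)curthread->t_startpc != (uintptr_t)`idle/
{
	self->on_n = timestamp;
}

/* Only threads that passed the predicate above have on_n set. */
sched:::off-cpu
/self->on_n/
{
	util += timestamp - self->on_n;
	self->on_n = 0;
}

profile:::tick-5s
{
	printf("elapsed (ns) : %lu\n", timestamp - start_n);
	printf("util (ns) : %ld\n", util);
	start_n = timestamp;
	util = 0;
}
```

Because the idle threads never get `self->on_n` set, the existing `/self->on_n/` predicate on sched:::off-cpu filters them out without any further change.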
Peter Lawrence
2007-Jul-21 01:21 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
Tim,

the "idle loop" is actually a separate thread in the kernel, so the
system is almost always on _some_ thread ~!~ (in fact I have a hard time
imagining what your system was doing in the other 10% of its time...)

here's what I do, even though some folks don't like its coefficient
of Interface Stability...

in S-11, and newer S-10, the functions named
    fbt::idle_enter() and fbt::idle_exit()
are called by the Solaris idle loop in the order suggested by their names.
YMMV, but this is "openSolaris" so you can always verify it with the actual
idle loop source code any time you want.

on the other hand if the folks hadn't made the comment to me I would have
given you the wrong function names (see pps below), so they do have a
point ~!~

-Pete Lawrence.

ps, the "idle" loop can be quite active!, searching the ready queue for
scheduled/runnable threads...

pps, in S-9, and older S-10, the functions are named
    fbt::set_cpu_idle() and fbt::unset_cpu_idle()
don'cha just love backwards compatibility...!...
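A rough sketch of how the idle_enter()/idle_exit() pair might be used to time the idle loop directly. It assumes, based only on the description above, that both functions are visible to the fbt provider on the release in question and that they are called from the idle thread in enter-then-exit order; the variable and aggregation names are made up for illustration. On S-9 and older S-10 the older function names mentioned above would have to be substituted.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Assumption: the idle loop calls idle_enter() when it starts idling. */
fbt::idle_enter:entry
{
	self->idle_n = timestamp;
}

/* Assumption: the same idle thread calls idle_exit() when it stops. */
fbt::idle_exit:entry
/self->idle_n/
{
	/* Sum the idle intervals separately for each CPU. */
	@idle[cpu] = sum(timestamp - self->idle_n);
	self->idle_n = 0;
}

profile:::tick-5s
{
	printf("idle time per CPU (ns) since the last report:\n");
	printa("  cpu %d: %@d\n", @idle);
	trunc(@idle);
}
```

An interval is only credited when idle_exit() fires, so a CPU that stays idle across a report boundary shows that time in a later report. As noted above, these are private kernel functions, so the probes may not exist, or may behave differently, on another release.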
Surya.Prakki at Sun.COM
2007-Jul-23 03:38 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
The idle thread always runs with priority -1.
You can keep it out this way:

    sched:::on-cpu
    /curthread->t_pri != -1/
    {
        self->on_n = timestamp;
    }

-surya
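Putting the priority predicate into the original script, a minimal sketch might look like the following. Two things here are assumptions of this sketch rather than anything posted in the thread. First, it accumulates into an aggregation with sum() instead of the global `util`; aggregations are maintained per CPU, so updates arriving from both CPUs at once are not lost. Second, it reads the kernel variable `ncpus to express the result as a percentage of total CPU capacity.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

uint64_t start_n;

BEGIN
{
	start_n = timestamp;
}

/* The idle thread runs at priority -1; do not start a timer for it. */
sched:::on-cpu
/curthread->t_pri != -1/
{
	self->on_n = timestamp;
}

sched:::off-cpu
/self->on_n/
{
	@busy = sum(timestamp - self->on_n);
	self->on_n = 0;
}

profile:::tick-5s
{
	/*
	 * Dividing the summed nanoseconds by (elapsed ns * CPUs / 100)
	 * turns the total into an integer percentage of capacity.
	 */
	normalize(@busy, (timestamp - start_n) * `ncpus / 100);
	printa("non-idle on-CPU time: %@d%%\n", @busy);
	clear(@busy);
	start_n = timestamp;
}
```

The result is truncated to a whole percent, and a thread that stays on CPU across a report boundary is only counted when it finally goes off CPU, so individual intervals can read slightly low or high.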
Tim Cook - PAE
2007-Jul-23 23:56 UTC
[dtrace-discuss] sched provider - what do on-cpu & off-cpu signify?
All,

Thanks for your suggestions. I went with Surya's method, as it has a
comparison to a constant in the clause, which is hopefully the lowest
overhead.

Thanks to Alexander as well for the example program to hide from clock().
I used it to test my script, which is now available at
http://blogs.sun.com/timc. The purpose is to estimate how much of the CPU
utilization we are seeing would have gone unseen by the old clock()-based
accounting of S9 and earlier.

Regards,
Tim

Surya.Prakki at Sun.COM wrote:
> Idle thread always runs with priority -1;
> You can keep it out this way :
>
> sched:::on-cpu
> /curthread->t_pri != -1/
> {
>     self->on_n = timestamp;
> }
>
> -surya

--
Tim Cook
Performance and Applications Engineering
Sun Microsystems