Hi, Need some dtrace pointers to track time spent by some programs. N processes with N threads each running on a multiprocessor system. What probes to use to decompose total run time to on-proc, sleep, io etc? Since it''s a muliprocessor system, I am guessing that the toal units of time will be in the terms of " clock ticks" and so will be the break up. So for example on a 2X2Ghz system a program running for 10 sec has, theoretically 10*2*2*1024*1024 clock ticks potentially available. How to track how much of this the program is using (with a breakup of on-cpu and io wait etc) Or am I completely off on this with a different approach? Also what will the additional impact of tracking such usage with Dtrace. Thanks -- This message posted from opensolaris.org
Ryan Johnson
2009-Jul-10 16:22 UTC
[dtrace-discuss] track program time - need some guidelines
tester wrote:> Hi, > > Need some dtrace pointers to track time spent by some programs. N processes with N threads each running on a multiprocessor system. What probes to use to decompose total run time to on-proc, sleep, io etc? > Since it''s a muliprocessor system, I am guessing that the toal units of time will be in the terms of " clock ticks" and so will be the break up. > So for example on a 2X2Ghz system a program running for 10 sec has, > theoretically 10*2*2*1024*1024 clock ticks potentially available. How to track how much of this the program is using (with a breakup of on-cpu and io wait etc) > Or am I completely off on this with a different approach? >Sounds like you want the Sun Studio performance analyzer to me... man collect Ryan
James Litchfield
2009-Jul-11 07:58 UTC
[dtrace-discuss] track program time - need some guidelines
Solaris 10 uses microstate accounting all the time. This information is always tracked in the kernel for each LWP. Each running userland thread is tied to an LWP. There are 10 states defined in /usr/include/sys/msacct.h. Something such as this should help you get started. I used open as a convenient point to spit stuff out. Don''t forget -C in the invocation. #include <sys/thread.h> #include <sys/klwp.h> #include <sys/msacct.h> syscall::open:entry /pid > 0/ { this->lwp = curthread->t_lwp; printf("Current USER time: %d for thread %u in pid %u\n", this->lwp->lwp_mstate.ms_acct[LMS_USER], curthread->t_tid, pid); } tester wrote:> Hi, > > Need some dtrace pointers to track time spent by some programs. N processes with N threads each running on a multiprocessor system. What probes to use to decompose total run time to on-proc, sleep, io etc? > Since it''s a muliprocessor system, I am guessing that the toal units of time will be in the terms of " clock ticks" and so will be the break up. > So for example on a 2X2Ghz system a program running for 10 sec has, > theoretically 10*2*2*1024*1024 clock ticks potentially available. How to track how much of this the program is using (with a breakup of on-cpu and io wait etc) > Or am I completely off on this with a different approach? > > Also what will the additional impact of tracking such usage with Dtrace. > > Thanks >
James Litchfield
2009-Jul-11 08:01 UTC
[dtrace-discuss] track program time - need some guidelines
In case it''s not clear, each state transition is tracked and updated using gethrtime() in the kernel when it occurs. It''s not done by statistical sampling when the clock tick fires. Jim ---- James Litchfield wrote:> Solaris 10 uses microstate accounting all the time. This information > is always tracked in the kernel for each LWP. Each running userland > thread is tied to an LWP. > > There are 10 states defined in /usr/include/sys/msacct.h. > > Something such as this should help you get started. I used open as a > convenient > point to spit stuff out. Don''t forget -C in the invocation. > > #include <sys/thread.h> > #include <sys/klwp.h> > #include <sys/msacct.h> > > syscall::open:entry > /pid > 0/ > { > this->lwp = curthread->t_lwp; > printf("Current USER time: %d for thread %u in pid %u\n", > this->lwp->lwp_mstate.ms_acct[LMS_USER], > curthread->t_tid, pid); > } > > > > > tester wrote: >> Hi, >> >> Need some dtrace pointers to track time spent by some programs. N >> processes with N threads each running on a multiprocessor system. >> What probes to use to decompose total run time to on-proc, sleep, io >> etc? >> Since it''s a muliprocessor system, I am guessing that the toal units >> of time will be in the terms of " clock ticks" and so will be the >> break up. >> So for example on a 2X2Ghz system a program running for 10 sec has, >> theoretically 10*2*2*1024*1024 clock ticks potentially available. How >> to track how much of this the program is using (with a breakup of >> on-cpu and io wait etc) >> Or am I completely off on this with a different approach? >> >> Also what will the additional impact of tracking such usage with Dtrace. >> >> Thanks >> > >
> > syscall::open:entry > > /pid > 0/ > > { > > this->lwp = curthread->t_lwp; > > printf("Current USER time: %d for thread %u > in pid %u\n", > > this->lwp->lwp_mstate.ms_acct[LMS_USER], > > curthread->t_tid, pid); > > }Thanks Jim. Wouldn''t this miss the case if a thread doesn''t make any system calls. Also, I am assuming all syscalls from pid needs to be monitored. Is there a noticeable perf impact to pid from that approach? WIll something from sched provider make it easier? Thanks -- This message posted from opensolaris.org
Jim Mauro
2009-Jul-11 15:17 UTC
[dtrace-discuss] track program time - need some guidelines
What problem are you trying to solve? Use "prstat -Lm", or "prstat -Lmp <PID>", at 1 second intervals for a fixed duration, and do the math. The various columns produced by "pstat -m" (USR SYS SLP, etc) represent the percentage of time the sampled thread spent in each microstate for the last sampling period (except the last 4 columns, which are counts). Chad Myhier improved ptime to include reporting of time spent in various microstates. I can send you a binary - are you on x86 or SPARC? You should also explore extended accounting. HTH, /jim tester wrote:> Hi, > > Need some dtrace pointers to track time spent by some programs. N processes with N threads each running on a multiprocessor system. What probes to use to decompose total run time to on-proc, sleep, io etc? > Since it''s a muliprocessor system, I am guessing that the toal units of time will be in the terms of " clock ticks" and so will be the break up. > So for example on a 2X2Ghz system a program running for 10 sec has, > theoretically 10*2*2*1024*1024 clock ticks potentially available. How to track how much of this the program is using (with a breakup of on-cpu and io wait etc) > Or am I completely off on this with a different approach? > > Also what will the additional impact of tracking such usage with Dtrace. > > Thanks >
James Litchfield
2009-Jul-11 15:37 UTC
[dtrace-discuss] track program time - need some guidelines
If you''re looking for something that will dump out the accumulated time of the LWPs in an MT app then another mechanism might better. a /proc reader that iterated over all the LWPs for the process would be one possibility. DTrace may well not be the best way to do this thing. Jim --- Jim Mauro wrote:> What problem are you trying to solve? > > Use "prstat -Lm", or "prstat -Lmp <PID>", > at 1 second intervals for a fixed duration, > and do the math. The various columns produced > by "pstat -m" (USR SYS SLP, etc) represent the > percentage of time the sampled thread spent in each > microstate for the last sampling period (except the > last 4 columns, which are counts). > > Chad Myhier improved ptime to include reporting > of time spent in various microstates. I can send you > a binary - are you on x86 or SPARC? > > You should also explore extended accounting. > > HTH, > /jim > > > tester wrote: >> Hi, >> >> Need some dtrace pointers to track time spent by some programs. N >> processes with N threads each running on a multiprocessor system. >> What probes to use to decompose total run time to on-proc, sleep, io >> etc? >> Since it''s a muliprocessor system, I am guessing that the toal units >> of time will be in the terms of " clock ticks" and so will be the >> break up. >> So for example on a 2X2Ghz system a program running for 10 sec has, >> theoretically 10*2*2*1024*1024 clock ticks potentially available. How >> to track how much of this the program is using (with a breakup of >> on-cpu and io wait etc) >> Or am I completely off on this with a different approach? >> >> Also what will the additional impact of tracking such usage with Dtrace. >> >> Thanks >> > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org >
Thanks Jim.M and thanks Jim.L. Sure, I can use prstat and post-process. I am not sure if I can get a accurate profile of the app. If I can go back to my previous example and assume an multi-threaded app running for 100 clock ticks, I would like to have a break up and say usr:20ticks,sys:10,wait1:20,wait2:20,wait3:20,sleep:10. Units of time can be anything meaningful. Basically, a much more accurate answer for where my app spend it''s time than what the traditional tools offer. I will explore extended accounting and Jim.L suggestion of using /proc. I have limitation''s with my C programming skills. Thanks -- This message posted from opensolaris.org
Marty Itzkowitz
2009-Jul-12 04:39 UTC
[dtrace-discuss] track program time - need some guidelines
The Sun Studio Performance Tools do statistical callstack profiling of an application, and collects the profile ticks for all microstates. They do not track state transitions, but they give pretty meaningful statistical results over minutes, and perhaps even tens of seconds, especially if you do high-resolution profiling. See: http://developers.sun.com/sunstudio/ and: http://developers.sun.com/sunstudio/overview/topics/analyzer_index.html Marty Itzkowitz On 7/11/09 7:02 PM, tester wrote:> Thanks Jim.M and thanks Jim.L. Sure, I can use prstat and post-process. I am not sure if I can get a accurate profile of the app. If I can go back to my previous example and assume an multi-threaded app running for 100 clock ticks, I would like to have a break up and say usr:20ticks,sys:10,wait1:20,wait2:20,wait3:20,sleep:10. Units of time can be anything meaningful. Basically, a much more accurate answer for where my app spend it''s time than what the traditional tools offer. > > I will explore extended accounting and Jim.L suggestion of using /proc. I have limitation''s with my C programming skills. > > Thanks >
Jim Mauro
2009-Jul-12 15:45 UTC
[dtrace-discuss] track program time - need some guidelines
So......why is prstat with microstates not sufficient? Observing your app with; prstat -Lcmp <PID> 1 10 > prstat.out Will give you statistics collected at 1 second intervals for 10 seconds. You can do the math if you really need to get down to clock ticks (roughly). Chad''s modified ptime will give you time spent in each microstate. Also, as per Marty''s suggestion, the Studio Analyzer, which is free, can do high resolution profiling. Writing C code to extract data from /proc, which would work, is like spending a lot of time and tools removing a car engine completely from the top, just so you can change the oil filter, when all you had to do was slide under the car to do it. Unless you''re goal is skill development or this is an academic exercise, I''d love to know what you''re trying to accomplish - what problem you''re trying to solve. On a modern microprocessor with sub-nanosecond clocking, do you really need clock-tick level accounting? Thanks, /jim tester wrote:> Thanks Jim.M and thanks Jim.L. Sure, I can use prstat and post-process. I am not sure if I can get a accurate profile of the app. If I can go back to my previous example and assume an multi-threaded app running for 100 clock ticks, I would like to have a break up and say usr:20ticks,sys:10,wait1:20,wait2:20,wait3:20,sleep:10. Units of time can be anything meaningful. Basically, a much more accurate answer for where my app spend it''s time than what the traditional tools offer. > > I will explore extended accounting and Jim.L suggestion of using /proc. I have limitation''s with my C programming skills. > > Thanks >
James Litchfield
2009-Jul-12 17:28 UTC
[dtrace-discuss] track program time - need some guidelines
It goes back to what problem is being solved. A) ptime requires that it start the app. It can''t attach to a running app. It aggregates all the data and does not offer a per-lwp breakdown. Given the mechanisms it uses, that''s all it can do. If that''s adequate then the problem is solved. B) adding up prstat -Lm values seems fairly clunky but it would work and I''m sure that one could script something up to do the work. C) extended accounting says it supports ''mstate'' but the demo program (/usr/demo/libexacct/exdump.c) doesn''t emit it. This is all that comes out... 100b cpu-user-sec 0 100c cpu-user-nsec 3585639 100d cpu-sys-sec 0 100e cpu-sys-nsec 4956882 Reviewing /usr/include/sys/exacct_catalog.h would indicate that, if the information is indeed collected, it is not obviously extractable. D) Actually a /proc reader wouldn''t work as the needed information is not available in any of the structures. You''d need a kmem reader to get the info which is definitely on the tricky side. As a note to ''tester'', microstate accounting is not tied to ticks of statistical profiling. The kernel tracks the time that an LWP spends in each state it''s in. It''s far more accurate than tick based accounting. Jim ---- Jim Mauro wrote:> > So......why is prstat with microstates not sufficient? > Observing your app with; > prstat -Lcmp <PID> 1 10 > prstat.out > > Will give you statistics collected at 1 second intervals for > 10 seconds. You can do the math if you really need to > get down to clock ticks (roughly). > > Chad''s modified ptime will give you time spent in each microstate. > > Also, as per Marty''s suggestion, the Studio Analyzer, which is free, > can do high resolution profiling. > > Writing C code to extract data from /proc, which would work, is like > spending a lot of time and tools removing a car engine completely > from the top, just so you can change the oil filter, when all you had > to do was slide under the car to do it. > > Unless you''re goal is skill development or this is an academic exercise, > I''d love to know what you''re trying to accomplish - what problem > you''re trying to solve. On a modern microprocessor with > sub-nanosecond clocking, do you really need clock-tick level > accounting? > > Thanks, > /jim > > > tester wrote: >> Thanks Jim.M and thanks Jim.L. Sure, I can use prstat and >> post-process. I am not sure if I can get a accurate profile of the >> app. If I can go back to my previous example and assume an >> multi-threaded app running for 100 clock ticks, I would like to have >> a break up and say >> usr:20ticks,sys:10,wait1:20,wait2:20,wait3:20,sleep:10. Units of time >> can be anything meaningful. Basically, a much more accurate answer >> for where my app spend it''s time than what the traditional tools offer. >> >> I will explore extended accounting and Jim.L suggestion of using >> /proc. I have limitation''s with my C programming skills. >> Thanks >> > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Chad Mynhier
2009-Jul-12 19:13 UTC
[dtrace-discuss] track program time - need some guidelines
On Sun, Jul 12, 2009 at 1:28 PM, James Litchfield<James.Litchfield at sun.com> wrote:> It goes back to what problem is being solved. > > A) ptime requires that it start the app. It can''t > ? ?attach to a running app. It aggregates all the > ? ?data and does not offer a per-lwp breakdown. > ? ?Given the mechanisms it uses, that''s all it can do. > ? ?If that''s adequate then the problem is solved.Actually, the version of ptime that Jim''s talking about _can_ attach to a running app. I added the -p option to do just that. (Of course, it won''t attach to a running app and then wait until it completes, it only attaches long enough to grab a snapshot of the microstate stats.) For example: # ptime -mp `pgrep nscd` real 29:45:57.427092032 user 1.852038785 sys 2.133489810 trap 0.005027352 tflt 0.000000000 dflt 0.000009597 kflt 0.000000000 lock 29:45:54.115500466 slp 773:53:38.737153459 lat 0.105180668 stop 0.017284688 # Chad
James Litchfield
2009-Jul-12 21:17 UTC
[dtrace-discuss] track program time - need some guidelines
Agreed. I missed that. Since it does only the process, is that good enough? Jim ---- Chad Mynhier wrote:> On Sun, Jul 12, 2009 at 1:28 PM, James > Litchfield<James.Litchfield at sun.com> wrote: > >> It goes back to what problem is being solved. >> >> A) ptime requires that it start the app. It can''t >> attach to a running app. It aggregates all the >> data and does not offer a per-lwp breakdown. >> Given the mechanisms it uses, that''s all it can do. >> If that''s adequate then the problem is solved. >> > > Actually, the version of ptime that Jim''s talking about _can_ attach > to a running app. I added the -p option to do just that. (Of course, > it won''t attach to a running app and then wait until it completes, it > only attaches long enough to grab a snapshot of the microstate stats.) > > For example: > > # ptime -mp `pgrep nscd` > > real 29:45:57.427092032 > user 1.852038785 > sys 2.133489810 > trap 0.005027352 > tflt 0.000000000 > dflt 0.000009597 > kflt 0.000000000 > lock 29:45:54.115500466 > slp 773:53:38.737153459 > lat 0.105180668 > stop 0.017284688 > # > > Chad > >
James Litchfield
2009-Jul-12 21:20 UTC
[dtrace-discuss] track program time - need some guidelines
It''s also Nevada/OpenSolaris only. If you want microstate accounting on S10 you need something else. Unless it was back ported to S10 without the man page being updated? Jim ---- James Litchfield wrote:> Agreed. I missed that. Since it does only the process, > is that good enough? > > Jim > ---- > Chad Mynhier wrote: >> On Sun, Jul 12, 2009 at 1:28 PM, James >> Litchfield<James.Litchfield at sun.com> wrote: >> >>> It goes back to what problem is being solved. >>> >>> A) ptime requires that it start the app. It can''t >>> attach to a running app. It aggregates all the >>> data and does not offer a per-lwp breakdown. >>> Given the mechanisms it uses, that''s all it can do. >>> If that''s adequate then the problem is solved. >>> >> >> Actually, the version of ptime that Jim''s talking about _can_ attach >> to a running app. I added the -p option to do just that. (Of course, >> it won''t attach to a running app and then wait until it completes, it >> only attaches long enough to grab a snapshot of the microstate stats.) >> >> For example: >> >> # ptime -mp `pgrep nscd` >> >> real 29:45:57.427092032 >> user 1.852038785 >> sys 2.133489810 >> trap 0.005027352 >> tflt 0.000000000 >> dflt 0.000009597 >> kflt 0.000000000 >> lock 29:45:54.115500466 >> slp 773:53:38.737153459 >> lat 0.105180668 >> stop 0.017284688 >> # >> >> Chad >> >> > >
Chad Mynhier
2009-Jul-12 21:40 UTC
[dtrace-discuss] track program time - need some guidelines
No, it''s not been back-ported to S10, but the binary works on S10 with no problem if you copy it over. (It doesn''t use any interfaces introduced since Solaris 10.) Chad On Sun, Jul 12, 2009 at 5:20 PM, James Litchfield<James.Litchfield at sun.com> wrote:> It''s also Nevada/OpenSolaris only. > If you want microstate accounting on S10 you > need something else. > > Unless it was back ported to S10 without the man page being > updated? > > Jim > ---- > James Litchfield wrote: >> >> Agreed. I missed that. Since it does only the process, >> is that good enough? >> >> Jim >> ---- >> Chad Mynhier wrote: >>> >>> On Sun, Jul 12, 2009 at 1:28 PM, James >>> Litchfield<James.Litchfield at sun.com> wrote: >>> >>>> >>>> It goes back to what problem is being solved. >>>> >>>> A) ptime requires that it start the app. It can''t >>>> ? attach to a running app. It aggregates all the >>>> ? data and does not offer a per-lwp breakdown. >>>> ? Given the mechanisms it uses, that''s all it can do. >>>> ? If that''s adequate then the problem is solved. >>>> >>> >>> Actually, the version of ptime that Jim''s talking about _can_ attach >>> to a running app. ?I added the -p option to do just that. ?(Of course, >>> it won''t attach to a running app and then wait until it completes, it >>> only attaches long enough to grab a snapshot of the microstate stats.) >>> >>> For example: >>> >>> # ptime -mp `pgrep nscd` >>> >>> real 29:45:57.427092032 >>> user ? ? ? ?1.852038785 >>> sys ? ? ? ? 2.133489810 >>> trap ? ? ? ?0.005027352 >>> tflt ? ? ? ?0.000000000 >>> dflt ? ? ? ?0.000009597 >>> kflt ? ? ? ?0.000000000 >>> lock 29:45:54.115500466 >>> slp ?773:53:38.737153459 >>> lat ? ? ? ? 0.105180668 >>> stop ? ? ? ?0.017284688 >>> # >>> >>> Chad >>> >>> >> >> > >
Thanks everyone. A quick question, where would be the time be accounted for a thread that wait for a write/read syscall to a block device? I would only like to know the synchronous app wait from an app perspective irrespective of all the async activities that supports it. I havn''t looked at Sun studio yet since I don''t think I can install it our env. Time for me go and re-read the thread again,I guess. Thanks again. -- This message posted from opensolaris.org
Jim Mauro
2009-Jul-13 13:20 UTC
[dtrace-discuss] track program time - need some guidelines
Threads are put to sleep when they issue blocking system calls. I''m not sure exactly what you mean by "only know synchronous app wait from an app perspective....", but the threads will be in the sleep state (SLP) after issuing the read/write until the data is available, at which point a wakeup is issued and the thread goes to runnable. You''re conspicuously avoiding answering the "what problem are you trying to solve question", which has been asked multiple times. This is no longer a DTrace question, so any further queries should get posted to the performance forum. /jim tester wrote:> Thanks everyone. A quick question, where would be the time be accounted for a thread that wait for a write/read syscall to a block device? I would only like to know the synchronous app wait from an app perspective irrespective of all the async activities that supports it. I havn''t looked at Sun studio yet since I don''t think I can install it our env. Time for me go and re-read the thread again,I guess. > > Thanks again. >
Thanks Jim.> Threads are put to sleep when they issue blocking > system calls. > I''m not sure exactly what you mean by "only know > synchronous > app wait from an app perspective....", but the > threads will be in the > sleep state (SLP) after issuing the read/write until > the data is > available, at which point a wakeup is issued and the > thread goes to > runnable.I wanted to avoid any filesystem specifc behavior from the output.> You''re conspicuously avoiding answering the "what > problem are you > trying to solve question", which has been asked > multiple times.I hoped I tried to convey that couple of times, I guess I am not clear enough. Sorry about that. The problem remains the same: profile app run time. sys, usr, various wait''s. From your reponse, it becomes clear that think I have been using the wrong term, instead of calling wait''s that time is sleep. So, a breakup of sleep time among various reasons.> This is no longer a DTrace question, so any further > queries should get > posted to the performance forum. >OK. CC''d to "Communities-performance-discuss" -- This message posted from opensolaris.org