... almost 3,000,000,000,000 of them. I''m instrumenting Sendmail (as some of you are probably bored of hearing). Right now, I''m looking at where Sendmail spends its time when it''s creating queue files. I''ve discovered that a lot of time is being spent in calls to fdsync(). The set of numbers I''m using shows 3,255,859,582,382 nanoseconds being spent in fdsync() for a given workload. So I set about instrumenting fdsync(), and counting the time consumed by the kernel calls it makes under the same workload. Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. So where did the other 3,000... go? And can I get DTrace to (automatically?) show me where the time is being spent? I know I can use the lockstat provider to see how long I''m spending waiting for locks, but presumably, since those locks are being waited for by kernel functions I''ve already accounted for that time. Ditto for the io provider, right? Or do I have to instrument those too? More details, including pretty charts, at http://jc.ngo.org.uk/blog/2006/08/08/day-30-of-60-what-are-the-single-queue-directory-bottlenecks-pt-2/ N PS: Any chance D could grow a pragma to automatically insert commas in large numbers? It would make comparing results much easier.
So, If I''m reading this right, you''re profiling something that takes 3,255 seconds (54 minutes) - and you''re worried about 3 microseconds? It''s noise. Forget it. The 3 microseconds you can not account for represents .0001 percent (if I''m doing that math right) of the time window you''re profiling. There''s nothing there. Move along folks, nothing to see here. For measurements at this granularity, there''s always going to be a fudge factor of at least a few percent in terms of the accuracy of the times. My apologies if I''m reading the numbers wrong... /jim Nik Clayton wrote:> ... almost 3,000,000,000,000 of them. > > I''m instrumenting Sendmail (as some of you are probably bored of > hearing). > > Right now, I''m looking at where Sendmail spends its time when it''s > creating queue files. > > I''ve discovered that a lot of time is being spent in calls to fdsync(). > The set of numbers I''m using shows 3,255,859,582,382 nanoseconds being > spent in fdsync() for a given workload. > > So I set about instrumenting fdsync(), and counting the time consumed > by the kernel calls it makes under the same workload. > > Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. > > So where did the other 3,000... go? And can I get DTrace to > (automatically?) show me where the time is being spent? I know I can > use the lockstat provider to see how long I''m spending waiting for > locks, but presumably, since those locks are being waited for by > kernel functions I''ve already accounted for that time. > > Ditto for the io provider, right? Or do I have to instrument those too? > > More details, including pretty charts, at > > http://jc.ngo.org.uk/blog/2006/08/08/day-30-of-60-what-are-the-single-queue-directory-bottlenecks-pt-2/ > > > N > > PS: Any chance D could grow a pragma to automatically insert commas > in large numbers? It would make comparing results much easier. > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
My bad - my eyes are shot. Ignore this - I''ll go back to my happy place now. /jim Jim Mauro wrote:> So, If I''m reading this right, you''re profiling something that takes > 3,255 seconds > (54 minutes) - and you''re worried about 3 microseconds? > > It''s noise. Forget it. The 3 microseconds you can not account for > represents > .0001 percent (if I''m doing that math right) of the time window you''re > profiling. There''s nothing there. Move along folks, nothing to see here. > > For measurements at this granularity, there''s always going to be a fudge > factor of at least a few percent in terms of the accuracy of the times. > > My apologies if I''m reading the numbers wrong... > > /jim > > > Nik Clayton wrote: >> ... almost 3,000,000,000,000 of them. >> >> I''m instrumenting Sendmail (as some of you are probably bored of >> hearing). >> >> Right now, I''m looking at where Sendmail spends its time when it''s >> creating queue files. >> >> I''ve discovered that a lot of time is being spent in calls to fdsync(). >> The set of numbers I''m using shows 3,255,859,582,382 nanoseconds >> being spent in fdsync() for a given workload. >> >> So I set about instrumenting fdsync(), and counting the time consumed >> by the kernel calls it makes under the same workload. >> >> Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. >> >> So where did the other 3,000... go? And can I get DTrace to >> (automatically?) show me where the time is being spent? I know I can >> use the lockstat provider to see how long I''m spending waiting for >> locks, but presumably, since those locks are being waited for by >> kernel functions I''ve already accounted for that time. >> >> Ditto for the io provider, right? Or do I have to instrument those too? >> >> More details, including pretty charts, at >> >> http://jc.ngo.org.uk/blog/2006/08/08/day-30-of-60-what-are-the-single-queue-directory-bottlenecks-pt-2/ >> >> >> N >> >> PS: Any chance D could grow a pragma to automatically insert commas >> in large numbers? It would make comparing results much easier. >> _______________________________________________ >> dtrace-discuss mailing list >> dtrace-discuss at opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Hi Nick, Will one of Brendan''s tools help with this. In particular, I was thinking of hotuser, dapptrace, and dappprof. http://users.tpg.com.au/adsln4yb/dtrace.html Chip Nik Clayton wrote:> ... almost 3,000,000,000,000 of them. > > I''m instrumenting Sendmail (as some of you are probably bored of > hearing). > > Right now, I''m looking at where Sendmail spends its time when it''s > creating queue files. > > I''ve discovered that a lot of time is being spent in calls to fdsync(). > The set of numbers I''m using shows 3,255,859,582,382 nanoseconds being > spent in fdsync() for a given workload. > > So I set about instrumenting fdsync(), and counting the time consumed > by the kernel calls it makes under the same workload. > > Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. > > So where did the other 3,000... go? And can I get DTrace to > (automatically?) show me where the time is being spent? I know I can > use the lockstat provider to see how long I''m spending waiting for > locks, but presumably, since those locks are being waited for by > kernel functions I''ve already accounted for that time. > > Ditto for the io provider, right? Or do I have to instrument those too? > > More details, including pretty charts, at > > http://jc.ngo.org.uk/blog/2006/08/08/day-30-of-60-what-are-the-single-queue-directory-bottlenecks-pt-2/ > > > N > > PS: Any chance D could grow a pragma to automatically insert commas > in large numbers? It would make comparing results much easier. > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Bryan Cantrill
2006-Aug-08 17:52 UTC
[dtrace-discuss] Some of my nanoseconds are missing...
On Tue, Aug 08, 2006 at 05:58:43PM +0100, Nik Clayton wrote:> ... almost 3,000,000,000,000 of them. > > I''m instrumenting Sendmail (as some of you are probably bored of hearing). > > Right now, I''m looking at where Sendmail spends its time when it''s > creating queue files. > > I''ve discovered that a lot of time is being spent in calls to fdsync(). > The set of numbers I''m using shows 3,255,859,582,382 nanoseconds being > spent in fdsync() for a given workload.Is that virtual time (vtimestamp) or wall time (timestamp)? And could you send the script that determines the delta? (A common mistake is to store the timestamp on an entry, but forget to predicate the return on the timestamp being non-zero -- potentially leading to an enormous outlier.)> So I set about instrumenting fdsync(), and counting the time consumed by > the kernel calls it makes under the same workload. > > Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. > > So where did the other 3,000... go? And can I get DTrace to > (automatically?) show me where the time is being spent? I know I can > use the lockstat provider to see how long I''m spending waiting for > locks, but presumably, since those locks are being waited for by kernel > functions I''ve already accounted for that time.It depends if you''re on CPU (vtimestamp) or off (timestamp). Because fdsync() almost certainly blocks, you might want to use the sched:::off-cpu probe to figure out where you''re being taken off CPU and for how long.> PS: Any chance D could grow a pragma to automatically insert commas in > large numbers? It would make comparing results much easier.It''s actually there -- kind of. Expect mail from Mike soon... ;) - Bryan -------------------------------------------------------------------------- Bryan Cantrill, Solaris Kernel Development. http://blogs.sun.com/bmc
Michael Shapiro
2006-Aug-08 18:12 UTC
[dtrace-discuss] Some of my nanoseconds are missing...
> I''m instrumenting Sendmail (as some of you are probably bored of hearing). > ... > > PS: Any chance D could grow a pragma to automatically insert commas in > large numbers? It would make comparing results much easier.There is actually already a way to do this, but unfortunately it''s a bit blocked behind a stupid bug/oversight. printf(3S) uses the quote ('') character as a format flag that tells it to insert the locale''s thousands grouping character for %i, %d, %u, %f, %F, %g, or %G. And this is actually implemented by DTrace printf() as well. That is, this is legal D: printf("%''d\n", 123000); However, in order for the underlying printf to do anything with %'', the locale subsystem has to be initialized by a call to setlocale(). And I lazily forgot to put this into dtrace(1M), in part because my locale of choice, POSIX C, maddeningly doesn''t implement the thousands separator, and thus until that issue is fixed or worked-around, I can''t use it (!) So, if you want to fix this yourself temporarily, put this into usr/src/cmd/dtrace/dtrace.c: #include <locale.h> setlocale(LC_NUMERIC, ""); and then do something like this: LC_NUMERIC=en_US dtrace <args ...> (or just have your default locale be a non-C locale, or set LC_ALL) and it should work. We''ll get this fixed, although with the POSIX locale issue still to be resolved. If you want to see if your locale supports a grouping character you can do this: $ LC_NUMERIC=en_US locale -k thousands_sep thousands_sep="," $ LC_NUMERIC=C locale -k thousands_sep thousands_sep="" Since you''re in the UK, I''m sure you''ve got the appropriate character set as Her Majesty intended it to be :) -Mike -- Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/
Bryan Cantrill wrote:> On Tue, Aug 08, 2006 at 05:58:43PM +0100, Nik Clayton wrote: >> ... almost 3,000,000,000,000 of them. >> >> I''m instrumenting Sendmail (as some of you are probably bored of hearing). >> >> Right now, I''m looking at where Sendmail spends its time when it''s >> creating queue files. >> >> I''ve discovered that a lot of time is being spent in calls to fdsync(). >> The set of numbers I''m using shows 3,255,859,582,382 nanoseconds being >> spent in fdsync() for a given workload. > > Is that virtual time (vtimestamp) or wall time (timestamp)?''timestamp''.> And could you send the script that determines the delta?http://jc.ngo.org.uk:80/svnweb/jc/view/nik/dtrace4sendmail/trunk/dtrace-scripts/queue-create-fbt.d > (A common mistake is to store the timestamp on an entry, but forget to > predicate the return on the timestamp being non-zero -- potentially > leading to an enormous outlier.) Oh, that''s worth knowing. The documentation doesn''t make that clear. uint64_t timestamp The current value of a nanosecond timestamp counter. This counter increments from an arbitrary point in the past and should only be used for relative computations. "arbitrary point in the past" definitely implies that it would never be zero. I might be suffering from that bug then.>> So I set about instrumenting fdsync(), and counting the time consumed by >> the kernel calls it makes under the same workload. >> >> Those figures show 286,960,093,086 nanoseconds taken up by kernel calls. >> >> So where did the other 3,000... go? And can I get DTrace to >> (automatically?) show me where the time is being spent? I know I can >> use the lockstat provider to see how long I''m spending waiting for >> locks, but presumably, since those locks are being waited for by kernel >> functions I''ve already accounted for that time. > > It depends if you''re on CPU (vtimestamp) or off (timestamp). Because > fdsync() almost certainly blocks, you might want to use the sched:::off-cpu > probe to figure out where you''re being taken off CPU and for how long.Right. But isn''t fdsync() going to block somewhere inside a kernel function? So shouldn''t that show up as time consumed by that function? Or can it (and more generally, any syscall) block outside of a kernel function?>> PS: Any chance D could grow a pragma to automatically insert commas in >> large numbers? It would make comparing results much easier. > > It''s actually there -- kind of. Expect mail from Mike soon... ;)That''s uncanny :-) N
Nik Clayton wrote:>> And could you send the script that determines the delta? > > http://jc.ngo.org.uk:80/svnweb/jc/view/nik/dtrace4sendmail/trunk/dtrace-scripts/queue-create-fbt.dI meant to say that the raw results are available too: This is the command line, and results from instrumenting kernel function calls triggered by a specific system call. # queue-create-fbt.d relay-zone fdsync (1 queue directory, 1 sender) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test12/ # queue-create-fbt.d relay-zone open (1 queue directory, 1 sender) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test13/ # queue-create-fbt.d relay-zone fdsync (1 queue directory, 10 senders) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test14/ # queue-create-fbt.d relay-zone open (1 queue directory, 10 senders) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test15/ This is the command line, and results, from instrumenting the time taken by each system call: # queue-create-syscall.d relay-zone (1 queue directory, 1 sender) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test9/ # queue-create-syscall.d relay-zone (1 queue directory, 10 senders) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test10/ # quque-create-syscall.d relay-zone (10 queue directories, 10 senders) http://jc.ngo.org.uk/svnweb/jc/browse/nik/dtrace4sendmail/trunk/results/test11/> > (A common mistake is to store the timestamp on an entry, but forget to > > predicate the return on the timestamp being non-zero -- potentially > > leading to an enormous outlier.) > > Oh, that''s worth knowing. The documentation doesn''t make that clear. > > uint64_t timestamp The current value of a nanosecond timestamp > counter. This counter increments from an > arbitrary point in the past and should only be > used for relative computations. > > "arbitrary point in the past" definitely implies that it would never be > zero.Ignore that. Clearly I''ve not had enough caffeine today. N