hi,

Today when I ran a simple dtrace script like the following:

[root at David /export/home/tyc/DTraceTest]# cat pidCall.d
#!/usr/sbin/dtrace -qs

pid$target:::entry
{
        self->depth++;
        printf("%*s->%s`%s\n", self->depth, " ", probemod, probefunc);
}

pid$target:::return
{
        printf("%*s<-%s`%s\n", self->depth, " ", probemod, probefunc);
        self->depth--;
}

the result was like this:

[root at David /export/home/tyc/DTraceTest]# ./pidCall.d -c date | wc
dtrace: 1832 drops on CPU 0
    5356    5358  158357
[root at David /export/home/tyc/DTraceTest]# ./pidCall.d -c date | wc
dtrace: 1835 drops on CPU 1
    5353    5355  158251
[root at David /export/home/tyc/DTraceTest]# ./pidCall.d -c date | wc
    7188    7190  220818
[root at David /export/home/tyc/DTraceTest]# ./pidCall.d -c date | wc
dtrace: 1834 drops on CPU 1
    5354    5356  158280

Why do drops come out sometimes?

Thanks!

This message posted from opensolaris.org
Yongce Tu wrote:
> This day when I run a simple dtrace script as following:
> [...]
> Why drops came out sometimes?

A "drop" is an "I can't allocate space for this right now" response
from DTrace. See the chapters on Buffers and Buffering in the guide.
You're asking for a very large amount of trace info in a very short
time.
Hi Yongce,

I suspect it is because the buffer that dtrace is tracing to in the
kernel is full. When enabled probes contain printf() statements, the
strings to be traced out are written into an active buffer in the
kernel. When a consumer attempts to read out the data, the dtrace
kernel code switches to an alternate buffer and copies out the
original buffer contents to the consumer. I suspect that the consumer
is not consuming data fast enough, and hence new trace data is filling
up the current active buffer, resulting in some trace data being
dropped.

I had a look at your script, and given what I think you're doing, it
is cleaner to just define a script with empty clauses:

[unknown]# cat pid.d
#!/usr/sbin/dtrace -s

pid$target:::entry
{
}

pid$target:::return
{
}

and let dtrace handle the call sequence tracing that you want by:

[unknown]# ./pid.d -F -c date | more
dtrace: script './pid.d' matched 5832 probes
Wed May 17 11:42:04 SGT 2006
dtrace: pid 103117 has exited
CPU FUNCTION
  0  <- rtld_db_preinit
  0  <- rd_event
  0  -> call_array
  0  <- call_array
  0  -> call_init
  0    -> leave
  0      -> fmap_setup
  0        -> munmap
  0        <- munmap
  0      <- fmap_setup
  0      -> rt_bind_clear
  0      <- rt_bind_clear
  0      -> _rt_bind_clear
  0      <- _rt_bind_clear
  0      -> rt_mutex_unlock
  0      <- rt_mutex_unlock
  0      -> _rt_null
  0      <- _rt_null
  0      -> rt_bind_clear
  0      <- rt_bind_clear
  0      -> _rt_bind_clear
  0      <- _rt_bind_clear
  0    <- leave
  0    -> libc_init
... ...

Incidentally, self->depth in pidCall.d is not properly maintained. If
you print out the value, you'd see it runs into negative values:

pid$target:::entry
{
        self->depth++;
        printf("+%*s+\n", self->depth, " ");
        printf(".%d. %*s->%s`%s\n", self->depth, self->depth, " ",
            probemod, probefunc);
}
... ...

Wed May 17 11:25:56 SGT 2006
CPU     ID                    FUNCTION:NAME
  0  40203          rtld_db_preinit:return + +
.0.  <-LM1`ld.so.1`rtld_db_preinit
  0  40205                 rd_event:return + +
.-1.  <-LM1`ld.so.1`rd_event
  0  37285               call_array:entry + +
.-1.  ->LM1`ld.so.1`call_array
  0  40214              call_array:return + +
.-1.  <-LM1`ld.so.1`call_array
  0  37286                call_init:entry + +
.-1.  ->LM1`ld.so.1`call_init
... ...

Cheers, hth,
sam

Yongce Tu wrote:
> [...]
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
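[As a minimal sketch of how the depth counter could be kept from going
negative (this fix is not from the thread; the predicate-based guard is
one possible approach):]

```d
#!/usr/sbin/dtrace -qs

pid$target:::entry
{
        self->depth++;
        printf("%*s->%s`%s\n", self->depth, " ", probemod, probefunc);
}

/*
 * Returns can fire for functions the process was already inside when
 * tracing started, so their entry probes never fired.  Guarding the
 * decrement with a predicate keeps self->depth from going negative.
 */
pid$target:::return
/self->depth > 0/
{
        printf("%*s<-%s`%s\n", self->depth, " ", probemod, probefunc);
        self->depth--;
}
```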
> the consumer is not consuming data fast enough

What does that mean? Who is the consumer? Why were there no drops when
I used the more cmd rather than the wc cmd? (wc consumes data faster
than more, right?) It puzzled me. How do I avoid these drops? Do I have
to reset a consumer option? Or are there some other ways?

> incidentally, self->depth in pidCall.d, is not properly maintained.
> if you print out the value, you'd see it runs into negative values:

Yes, I had neglected it.

Regards,
Yongce
Yongce Tu wrote:
>> the consumer is not consuming data fast enough
> what's mean? Who is the consumer? [...] how to avoid these drops?

The consumer is typically the dtrace command but could also be
plockstat and other such commands.

The more command reads data more quickly than the wc command, thus the
dtrace consumer is able to push data through faster.

There is a chapter in the dtrace manual which describes buffering. You
will find tuning information there.

Diane
kernel engineer
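[As a sketch of the kind of tuning that chapter describes (the option
names are standard D options; the particular values here are
illustrative, not recommendations), options can be set at the top of
the script itself:]

```d
#!/usr/sbin/dtrace -qs

#pragma D option bufsize=16m      /* larger principal buffer (default 4m) */
#pragma D option switchrate=10hz  /* drain buffers more often (default 1hz) */
#pragma D option strsize=64       /* smaller traced strings (default 256) */

pid$target:::entry
{
}

pid$target:::return
{
}
```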
On Wed, May 17, 2006 at 06:37:45PM -0700, Yongce Tu wrote:
>> the consumer is not consuming data fast enough
> what's mean?

The dtrace(1M) command is not downloading the data into userland fast
enough to keep up with the data rate the probes are generating.

> Who is the consumer? Why there were no drops when i used more cmd
> rather than wc cmd? (wc cmd consumed data faster than more cmd, it's
> right?) it puzzled me. how to avoid these drops? i have to reset the
> consumer option? Or, there are some other ways?

You need to either:

1. Decrease the amount of data you are collecting (i.e. lower the
   'strsize' option, or stop recording unneeded strings)

2. Increase the "switchrate" option, which determines how often
   dtrace(1M) downloads data from the kernel. It defaults to 1hz (once
   per second); increasing it to 5hz or 10hz might help.

3. Increase the "bufsize" option (defaults to 4 megabytes)

Your example appears to be most amenable to #1; as Sang-Suan Gam
pointed out, you're tracing data that dtrace(1M) already knows from
context. If you really want to generate your own data, you should try
decreasing the 'strsize' option; it defaults to 256, and your function
names are likely to be much shorter.

As a calculation, your data records for:

        printf("%*s<-%s`%s\n", self->depth, " ", probemod, probefunc);

will be two 32-bit integers (the first identifies the firing, the
second is the self->depth data) and three strings. That comes to:

        2*4 + 3*256 = 776 bytes;  4 megabytes / 776 bytes = 5405 records.

So if you generate more than 5,400 records per second, you'll have
drops. Lowering strsize to 64 would yield:

        2*4 + 3*64 = 200 bytes;  4 megabytes / 200 bytes = 20971 records.

Getting it down to no records (like Sang-Suan's example) means that
your records are 4 bytes apiece, allowing a million events per second.

Cheers,
- jonathan

-- 
Jonathan Adams, Solaris Kernel Development
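[A quick back-of-envelope check of the record-size arithmetic above,
sketched in shell; the 4 MB figure is the default principal buffer size
and 776/200 bytes are the per-record sizes derived in the message:]

```shell
# Record size = two 32-bit words + three traced strings of 'strsize' bytes;
# buffer capacity = principal buffer size / record size.
bufsize=$((4 * 1024 * 1024))        # default bufsize: 4 MB
rec_default=$((2 * 4 + 3 * 256))    # strsize=256 -> 776 bytes/record
rec_small=$((2 * 4 + 3 * 64))       # strsize=64  -> 200 bytes/record
echo "strsize=256: $rec_default bytes/record, $((bufsize / rec_default)) records per buffer"
echo "strsize=64:  $rec_small bytes/record, $((bufsize / rec_small)) records per buffer"
```

This prints the 5405- and 20971-record capacities quoted above, showing
why a fast-firing pid provider enabling overruns the default buffer
within one switchrate interval.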