Hello,
I have a customer who has some DTrace questions. I am guessing that someone knows the answer to these, so I am asking here. Here are the questions:

**********
In this document, we will describe how we assume that DTrace uses its memory. Most assumptions result from [1]. We want these assumptions to be validated by a DTrace expert from Oracle. This validation is necessary to provide us confidence that DTrace can execute for a long period of time (in the order of weeks) along with our software, without introducing errors due to e.g. "dynamic variable drops". In addition, we described a problem we experience with our DTrace script, for which we want to have support from you.

[1] Sun Microsystems Inc., "Solaris Dynamic Tracing Guide", September 2008.

Quotes from the Solaris Dynamic Tracing Guide [1], with interpretation:

* "Each time the variable self->read is referenced in your D program, the data object referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired."
  o Interpretation: Per CPU there is a dispatcher that has its own thread when it executes the sched:::on-cpu and sched:::off-cpu probes.

* "At that time, the ring buffer is consumed and processed. dtrace processes each ring buffer in CPU order. Within a CPU's buffer, trace records will be displayed in order from oldest to youngest."
  o Interpretation: There is a principal buffer per CPU.

1) Impact on Business
We have a number of assumptions that we would like to verify about DTrace.

2) What is the OS version and the kernel patch level of the system?
SunOS nlvdhe321 5.10 Generic_141444-09 sun4v sparc SUNW,T5240

3) What is the Firmware level of the system?
SP firmware 3.0.10.2.b
SP firmware build number: 56134
SP firmware date: Tue May 25 13:02:56 PDT 2010
SP filesystem version: 0.1.22
**********

Thanks

Scott Shurr | Solaris and Network Domain, Global Systems Support
Email: Scott.Shurr at oracle.com
Phone: 781-442-1352
Oracle Global Customer Services

Log, update, and monitor your Service Request online using My Oracle Support <http://www.oracle.com/us/support/044752.html>
I'm not sure I understand what is being asked here, but I'll take a shot...

Note it is virtually impossible to write a piece of software that is guaranteed to have sufficient space to buffer a given amount of data when the rate and size of the data flow is unknown. This is one of the robustness features of DTrace - it's smart enough to know that, and smart enough to let the user know when data cannot be buffered.

Yes, buffers are allocated per-CPU. There are several buffer types, depending on the dtrace invocation. Minimally, principal buffers are allocated per CPU when a DTrace consumer (dtrace(1M)) is executed. Read:
http://wikis.sun.com/display/DTrace/Buffers+and+Buffering

"self->read" is a thread-local variable, one of several variable types available in DTrace. The "self->" prefix defines the variable's scope - each kernel thread that's on the CPU when the probe(s) fire will have its own copy of a "self->" variable.

There is only one kernel dispatcher, not one per CPU. There are per-CPU run queues managed by the dispatcher.

As for running a DTrace script for hours/days/weeks, I have never been down that road. It is theoretically possible, of course, and seems to be a good use of speculative buffers or a ring buffer policy.

We cannot guarantee it will execute without errors ("dynamic variable drops", etc.). We can guarantee you'll know when errors occur. How can such guarantees be made with a dynamic tool like dtrace? Does your customer know up front how much data will be traced/processed/consumed, and at what rate?

Read this:
http://blogs.oracle.com/bmc/resource/dtrace_tips.pdf

Thanks
/jim
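[Editor's note: as a concrete illustration of the per-CPU principal buffers and buffer policies described above. These invocations are generic examples, not taken from this thread; the sizes are arbitrary and script.d is a placeholder name.]

    # one 8MB principal buffer is allocated per CPU (default "switch" policy)
    dtrace -b 8m -n 'syscall:::entry { @calls[probefunc] = count(); }'

    # ring-buffer policy, as described in the Buffers and Buffering chapter
    dtrace -x bufpolicy=ring -b 8m -s ./script.d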
On Jul 1, 2011 9:32 AM, "Scott Shurr" <scott.shurr at oracle.com> wrote:

> This validation is necessary to provide us confidence that DTrace can execute for a long period of time (in the order of weeks) along with our software, without introducing errors due to e.g. "dynamic variable drops".

DTrace is designed to be reliable in one sense only: it won't consume so many resources as to kill production performance. This means that DTrace cannot be reliable in the sense of not dropping probe data. This is inescapable.

Nico
--
My customer still thinks this is a bug:

**********
We think that DTrace is not working as designed. Therefore we wanted to report a bug for DTrace. Thank you for the answers from the mailing list, but these do not solve our problem.

The DTrace script that we describe in the document results in "dynamic variable drops" if the system is heavily loaded for a longer period of time (around 24 hours). We assume that this is not the intended behavior of DTrace.

We use a DTrace script with thread-local variables (self->) and still get "dynamic variable drops". DTrace has to be used for a longer period of time in our systems, where the utilization of the system can be high. In case of "dynamic variable drops", incorrect tracing is performed. As described in the previous document, we think our script is working correctly and therefore we think that DTrace is working incorrectly. We would like a solution for this.

The previous document contains some assumptions that we had to make about DTrace, but which we could verify using the DTrace documentation. In our script we use thread-local variables (self->). We assume that these variables result in at most one variable per hardware thread (which is a part of a CPU). Because the thread-local variable at a CPU can be reused in consecutive calls of the probes and we use no other variables, we assume that our script is not causing a full dynamic variable space. But DTrace appears to be doing something else that causes the dynamic variable space to get full (for a loaded system after 24 hours), because we still get "dynamic variable drops". This makes the DTrace solution unreliable. We would like to see a solution for this problem.

I attached the version of the DTrace script that we use. This script depends upon a process for its configuration. After running this script for approximately 24 hours on a heavily loaded machine, we get dynamic variable drops.
**********

It is my belief that this is not a bug, but I need something more to give the customer to convince him of this. I've attached his script, CSET.d.

Thanks

Scott Shurr | Solaris and Network Domain, Global Systems Support
Email: Scott.Shurr at oracle.com
Phone: 781-442-1352
Oracle Global Customer Services

Log, update, and monitor your Service Request online using My Oracle Support <http://www.oracle.com/us/support/044752.html>
Your customer is wrong. There is no way for DTrace (or any other piece of software) to guarantee sufficient buffer space all the time, every time.

You/we cannot make any guarantees to the customer that a given DTrace script will never run out of space for dynamic variables (thread-local variables, associative array variables). There are things that can be done to minimize the possibility of dynamic variable drops (a short generic sketch follows the quoted script below):

1. Clear unused dynamic variables in the D by assigning a value of zero.
2. Increase the space allocated for dynamic variables. The default is 1MB:

       #pragma D option dynvarsize=2m

Send your customer a link to the slides posted previously. Buy your customer a copy of "DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X, and FreeBSD". Actually, buy them two copies.

Please note also that among the thousands (tens of thousands?) of people using DTrace every day, we see very few complaints such as this. In fact, with the exception of the early days of DTrace, we never see them anymore, because we understand the error and we work around it.

If your customer is unwilling or unable to work within the constraints of this powerful tool, there's nothing anyone can do about that. We can't change the laws of the universe because your customer doesn't understand the complexities of dynamic instrumentation.

Thanks
/jim

On Jul 11, 2011, at 11:20 AM, Scott Shurr wrote:
> I've attached his script, CSET.d:
> #!/usr/sbin/dtrace -s
>
> /*---------------------------------------------------------------------------
>  | DTrace script
>  |
>  | Ident       : CSET.d
>  | Description : This script traces thread executions
>  |
>  | History
>  | yyyy-mm-dd  : <swchg> <author> [<platform> <release>]
>  | 2011-03-16  : Initial creation  tbijlsma
>  | 2011-03-16  : <SWCHG00371518> <tbijlsma>
>  |
>  | Copyright (c) 2011, ASML Holding N.V. (including affiliates).
>  | All rights reserved
>  *--------------------------------------------------------------------------*/
>
> /* Do not provide additional output */
> #pragma D option quiet
>
> /* Store the printf values in a circular buffer */
> #pragma D option bufpolicy=ring
>
> /* Necessary to signal CSET.c */
> #pragma D option destructive
>
> uint64_t Stimercorrection;  /* Offset set by the pid$1::configDtrace:entry probe */
> int *setOne;                /* Pointer to an int with value one */
>
> /* Initially set Stimercorrection to zero */
> BEGIN
> {
>     Stimercorrection = 0;
> }
>
> /* This probe sets the moment that CSET calls the function "configDtrace" as
>  * time 0 and signals the function, by setting the value of arg0 to 1
>  */
> pid$1::configDtrace:entry
> {
>     setOne = alloca(sizeof (int));
>     *setOne = 1;
>     Stimercorrection = timestamp;
>     copyout(setOne, arg0, sizeof (int));
> }
>
> /* When a SW thread starts executing, we store the start time */
> sched:::on-cpu
> /execname != "sched" && Stimercorrection/
> {
>     self->startTime = timestamp - Stimercorrection;
> }
>
> /* When a SW thread stops executing at a processor, print information for the
>  * event in the ring buffer
>  */
> sched:::off-cpu
> /self->startTime && execname != "sched" && Stimercorrection/
> {
>     printf("%d|%d|%i|%i|%i|%s\n",
>         self->startTime,               /* start time */
>         timestamp - Stimercorrection,
>         cpu,                           /* number of CPU */
>         tid,                           /* current thread */
>         pid,                           /* process id */
>         execname                       /* process name */
>     );
>     self->startTime = 0;
> }
>
> /* If the main task calls the function retrieveTrace, we stop the DTrace script
>  * and the values in the ring buffer will be printed
>  */
> pid$1::retrieveTrace:entry
> {
>     exit(0);
> }
>
> /* Stop the DTrace script if kill -9 has been called for CSET
>  * arg0 contains PID of the killed process and arg1 the signal
>  */
> syscall::kill:entry
> /arg0 == $1 && (arg1 == 9 || arg1 == 15)/
> {
>     exit(0);
> }
>
> /* Finally we print "EOT" to mark the end of a trace */
> END
> {
>     printf("EOT\n");
> }
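[Editor's note: a minimal, generic sketch of the two mitigations Jim lists above - enlarging the dynamic variable space and zeroing a thread-local variable once it is no longer needed. This is an illustration only, not a fix for CSET.d, and the 8m figure is an arbitrary example.]

    #!/usr/sbin/dtrace -s

    #pragma D option quiet
    /* give dynamic variables more room than the 1MB default */
    #pragma D option dynvarsize=8m

    syscall::read:entry
    {
        /* one copy of self->ts per kernel thread */
        self->ts = timestamp;
    }

    syscall::read:return
    /self->ts/
    {
        @lat["read latency (ns)"] = quantize(timestamp - self->ts);
        /* assigning zero frees the dynamic variable for reuse */
        self->ts = 0;
    }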
On Mon, Jul 11, 2011 at 10:20 AM, Scott Shurr <scott.shurr at oracle.com> wrote:

> The DTrace script that we describe in the document results in "dynamic variable drops" if the system is heavily loaded for a longer period of time (around 24 hours). We assume that this is not the intended behavior of DTrace.

This assumption is wrong. It is a principal tenet of DTrace's design that it will not critically interfere with production services. This means, for example, that DTrace being starved for resources will not result in a system hanging until resources are freed. This is also the reason for DTrace's lack of looping facilities in its D language (and byte-compiled instruction set).

Nico
--
On Jul 11, 2011, at 8:36 AM, dtrace-discuss-request at opensolaris.org wrote:

> From: Jim Mauro <james.mauro at oracle.com>
>
> You/we can not make any guarantees to the customer that a given DTrace script
> will never run out of space for dynamic variables (thread-local variables,
> associative array variables). There are things that can be done to minimize
> the possibility of dynamic variable drops:
> 1. Clear unused dynamic variables in the D by assigning a value of zero.
> 2. Increase the space allocated for dynamic variables. Default is 1MB.
>    #pragma D option dynvarsize=2m

Jim, are you saying that if a script has a small upper bound on the number of active dynamic variables, DTrace may still run out of dynamic variable space? Because that certainly is, if not a bug, very surprising behavior [note: I wrote this before I found the undocumented information about cleanrate mentioned below].

The script provided looks entirely reasonable. It uses one thread-local variable which is set in sched:::on-cpu and cleared in sched:::off-cpu. Honestly, the only way I can imagine this growing beyond the number of hardware contexts is if there is a problem with the assumption that every thread that fires on-cpu also fires off-cpu. And if that's the case, I would indeed call that a bug.

The script doesn't do anything that isn't in several examples in the DTrace documentation. There's no reason I can see that it should consume dynamic variable space if everything is working correctly. The user's expectations are reasonable to an experienced DTrace user.

Scott, your customer's message did not include the exact complaint from dtrace. In particular, it didn't say whether the dynamic variable drops warning included "with non-empty dirty list". If it did, then the problem may be that heavy load results in so many threads being scheduled in a short time that the asynchronous dynamic variable reaper can't keep up with all the thread-local variables. Increasing the frequency of the cleanrate tunable may help with this problem (as described in slide 23 of <http://blogs.oracle.com/bmc/resource/dtrace_tips.pdf>). This information is (still) not documented outside of a slide deck and the not-free book, so you may want to add an entry to CR 6221632.

joe
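[Editor's note: for reference, the cleanrate tunable mentioned above is set like other DTrace consumer options; its default is documented as 101hz. The value and the invocation below are examples only.]

    /* in the D script */
    #pragma D option cleanrate=303hz

    # or on the dtrace(1M) command line
    dtrace -x cleanrate=303hz -s ./CSET.d <pid-of-CSET>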
On Mon, Jul 11, 2011 at 1:28 PM, Joe Pallas <pallas at cs.stanford.edu> wrote:

> The script provided looks entirely reasonable. It uses one thread-local variable which is set in sched:::on-cpu and cleared in sched:::off-cpu. Honestly, the only way I can imagine this growing beyond the number of hardware contexts is if there is a problem with the assumption that every thread that fires on-cpu also fires off-cpu. And if that's the case, I would indeed call that a bug.

Do your probe actions generate any trace data? If so then probe firing rate matters. Unless your probes are using destructive ops like chill() to the exclusion of any data-generating actions (including system()), and to the exclusion of any copyin() and friends, then your script can exhaust resources available to DTrace, leading to drops.

The script you posted has a printf(), for example, which means that the probe in question generates data that can be dropped. The buffer into which the script's data get stored is a circular buffer - if the consumer (the user-land part of DTrace) does not consume the data from that buffer fast enough then DTrace can drop traces. You can help prevent this (but not guarantee that it won't happen) by running the consumer at a higher priority than the victim process(es).

Nico
--
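[Editor's note: one way to run the consumer at a higher priority on Solaris is priocntl(1); the line below is a sketch from memory, assuming sufficient privilege for the real-time class - verify the class-specific options against priocntl(1) before relying on it.]

    # start the dtrace consumer in the RT class at priority 10
    priocntl -e -c RT -p 10 dtrace -s ./CSET.d <pid-of-CSET>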
On Jul 11, 2011, at 12:39 PM, Nico Williams wrote:

> The script you posted has a printf(), for example, which means that the probe in question generates data that can be dropped. The buffer into which the script's data get stored is a circular buffer - if the consumer (the user-land part of DTrace) does not consume the data from that buffer fast enough then DTrace can drop traces. You can help prevent this (but not guarantee that it won't happen) by running the consumer at a higher priority than the victim process(es).

Thanks. Three points:

1) A clarification: "The script you posted" is not my script. I'm not Scott, who posted the script, nor am I Scott's customer, who wrote the script. I'm just a member of the DTrace user community trying to be helpful. I happen to be an Oracle employee, but my involvement with Solaris and DTrace is strictly as a user.

2) The complaint is about "dynamic variable drops," so trace data from printf overrunning the principal buffer is not the issue in this particular case.

3) The script specified bufpolicy=ring. Unless I misunderstand the documentation, trace data should not generate drops with ring buffering, because it is always supposed to overwrite the oldest data.

If I'm wrong about either of those last two points, I'm always happy to improve my understanding of DTrace.

joe
First, I confess that I'm pretty uncomfortable answering Oracle support questions here. Oracle has very aggressively monetized the operating system with the notion that its support was of tremendous value; that Oracle has to essentially rely on the broader community for that support speaks volumes about the actual value that Oracle is providing its customers in this regard. Put more bluntly: let no one question the vitality of illumos when it is we in the illumos community who are fielding a support call about your operating system...

Now, on to the meat here. The problem here is a bug in the script, albeit one that is due to some subtle semantics: threads that exit will fire sched:::on-cpu but never fire sched:::off-cpu. (That is, a thread exits while on the CPU; it does not give up the CPU to die.) This can, of course, be easily confirmed with DTrace itself; try this experiment:

  # dtrace -n 'sched:::on-cpu,sched:::off-cpu,proc:::lwp-exit/pid == $target/{printf("%d: %s\n", timestamp, probename)}' -q -c "/usr/bin/sleep 1" | sort -n

When I run this on my illumos machine, I see:

  153116965615211: on-cpu
  153116965637283: off-cpu
  153116965876279: on-cpu
  153116966272791: off-cpu
  153116966908637: on-cpu
  153116966979203: off-cpu
  153116967125718: on-cpu
  153116967215732: off-cpu
  153116967377948: on-cpu
  153116967447287: off-cpu
  153116967573339: on-cpu
  153116968106208: off-cpu
  153116968166568: on-cpu
  153116968181498: off-cpu
  153116968347277: on-cpu
  153116968367427: off-cpu
  153116980796598: on-cpu
  153116980810590: off-cpu
  153116980969177: on-cpu
  153116981580498: off-cpu
  153116981746760: on-cpu
  153116981761285: off-cpu
  153116981909940: on-cpu
  153116981931360: off-cpu
  153116996876523: on-cpu
  153116996890628: off-cpu
  153116997047335: on-cpu
  153116999245574: off-cpu
  153117999387773: on-cpu
  153117999472949: lwp-exit

So the script is leaking a dynamic variable whenever a thread that has been off CPU exits -- and it's not at all surprising that dynamic variable drops are only seen after 24 hours and only under heavy load. To fix this, the customer should add a clause for proc:::lwp-exit that zeroes out their thread-local variable....

- Bryan
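[Editor's note: in terms of the posted CSET.d, the clause Bryan describes would look something like the following; startTime is the thread-local variable from the customer's script.]

    /* A thread exits while still on CPU, so sched:::off-cpu never fires for it;
     * release its thread-local variable here to avoid leaking dynamic variable space.
     */
    proc:::lwp-exit
    /self->startTime/
    {
        self->startTime = 0;
    }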