Michael Ernest
2011-Jul-11 15:44 UTC
[dtrace-discuss] dtrace-discuss Digest, Vol 75, Issue 8
I can't see the attachment on the discussion board yet, but it sounds like your customer thinks a thread-local variable represents a hardware resource rather than a software entity, of which there may be many thousands, perhaps hundreds of thousands, over the life of a long trace. In particular, if the values associated with TLVs are not zeroed out after use, they accumulate over time and will certainly consume dynvar space. My first bet is that the script will show the TLVs are not being set to zero after they've been used.

Regards,

Michael

On Mon, Jul 11, 2011 at 8:21 AM, <dtrace-discuss-request at opensolaris.org> wrote:

> Today's Topics:
>
>    1. Re: CPU dispatcher and buffer questions (Scott Shurr)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 11 Jul 2011 11:20:39 -0400
> From: Scott Shurr <scott.shurr at oracle.com>
> To: dtrace-discuss at opensolaris.org
> Subject: Re: [dtrace-discuss] CPU dispatcher and buffer questions
> Message-ID: <4E1B14C7.5030603 at oracle.com>
> Content-Type: text/plain; charset="windows-1252"; Format="flowed"
>
> My customer still thinks this is a bug:
>
> **********
> We think that DTrace is not working as designed. Therefore we wanted to
> report a bug for DTrace. Thank you for the answers from the mailing list,
> but these do not solve our problem.
>
> The DTrace script that we describe in the document results in "dynamic
> variable drops" if the system is heavily loaded for a longer period of
> time (around 24 hours). We assume that this is not the intended behavior
> of DTrace.
>
> We use a DTrace script with thread local variables (self->) and still
> get "dynamic variable drops". DTrace has to be used for a longer period
> of time in our systems, where the utilization of the system can be high.
> In case of "dynamic variable drops" incorrect tracing is performed. As
> described in the previous document we think our script is working
> correctly and therefore we think that DTrace is working incorrectly. We
> would like a solution for this.
>
> The previous document contains some assumptions that we had to make
> about DTrace, but which we could verify using the DTrace documentation.
> In our script we use thread local variables (self->). We assume that
> these variables result in at most one variable per hardware thread (which
> is a part of a CPU). Because the thread local variable at a CPU can be
> reused in consecutive calls of the probes and we use no other variables,
> we assume that our script is not causing a full dynamic variable space.
> But DTrace appears to be doing something else that causes the dynamic
> variable space to get full (for a loaded system after 24 hours), because
> we still get "dynamic variable drops". This makes the DTrace solution
> unreliable. We would like to see a solution for this problem.
>
> I attached the version of the DTrace script that we use. This script
> depends upon a process for its configuration. After running this script
> for approximately 24 hours on a heavily loaded machine, we get dynamic
> variable drops.
> **********
>
> It is my belief that this is not a bug, but I need something more to
> give the customer to convince him of this.
> I've attached his script CSET.d
> Thanks
>
> Scott Shurr | Solaris and Network Domain, Global Systems Support
> Email: Scott.Shurr at oracle.com
> Phone: 781-442-1352
> Oracle Global Customer Services
>
> Log, update, and monitor your Service Request online using My Oracle
> Support <http://www.oracle.com/us/support/044752.html>
>
> On 07/01/11 10:58, Jim Mauro wrote:
> > I'm not sure I understand what is being asked here, but I'll take a
> > shot...
> >
> > Note it is virtually impossible to write a piece of software that is
> > guaranteed to have sufficient space to buffer a given amount of data
> > when the rate and size of the data flow is unknown. This is one of the
> > robustness features of dtrace - it's smart enough to know that, and
> > smart enough to let the user know when data can not be buffered.
> >
> > Yes, buffers are allocated per-CPU. There are several buffer types,
> > depending on the dtrace invocation. Minimally, principal buffers are
> > allocated per CPU when a dtrace consumer (dtrace(1M)) is executed. Read:
> > http://wikis.sun.com/display/DTrace/Buffers+and+Buffering
> >
> > The "self->read" describes a thread local variable, one of several
> > variable types available in DTrace. It defines the variable scope - each
> > kernel thread that's on the CPU when the probe(s) fires will have its
> > own copy of a "self->" variable.
> >
> > There is only one kernel dispatcher, not one per CPU. There are per-CPU
> > run queues managed by the dispatcher.
> >
> > As for running a DTrace script for hours/days/weeks, I have never been
> > down that road. It is theoretically possible of course, and seems to be
> > a good use of speculative buffers or a ring buffer policy.
> >
> > We can not guarantee it will execute without errors ("dynamic variable
> > drops", etc).
> > We can guarantee you'll know when errors occur.
> >
> > How can such guarantees be made with a dynamic tool like dtrace?
> > Does your customer know up-front how much data will be traced/processed/
> > consumed, and at what rate?
> >
> > Read this:
> > http://blogs.oracle.com/bmc/resource/dtrace_tips.pdf
> >
> > Thanks
> > /jim
> >
> > On Jul 1, 2011, at 9:30 AM, Scott Shurr wrote:
> >
> >> Hello,
> >> I have a customer who has some dtrace questions. I am guessing that
> >> someone knows the answer to these, so I am asking here. Here are the
> >> questions:
> >>
> >> **********
> >> In this document, we will describe how we assume that DTrace uses its
> >> memory. Most assumptions result from [1]. We want these assumptions
> >> to be validated by a DTrace expert from Oracle. This validation is
> >> necessary to provide us confidence that DTrace can execute for a long
> >> period of time (in the order of weeks) along with our software,
> >> without introducing errors due to e.g. "dynamic variable drops". In
> >> addition, we described a problem we experience with our DTrace
> >> script, for which we want to have support from you.
> >>
> >> [1] Sun Microsystems, Inc., "Solaris Dynamic Tracing Guide", September
> >> 2008.
> >>
> >> Quotes from Solaris Dynamic Tracing Guide [1], with interpretation:
> >> * "Each time the variable self->read is referenced in your D
> >>   program, the data object referenced is the one associated with the
> >>   operating system thread that was executing when the corresponding
> >>   DTrace probe fired."
> >>   o Interpretation: Per CPU there is a dispatcher that has its own
> >>     thread, when it executes the sched:::on-cpu and sched:::off-cpu
> >>     probes.
> >> * "At that time, the ring buffer is consumed and processed. dtrace
> >>   processes each ring buffer in CPU order. Within a CPU's buffer, trace
> >>   records will be displayed in order from oldest to youngest."
> >>   o Interpretation: There is a principal buffer per CPU
> >>
> >> 1) Impact on Business
> >> We have a number of assumptions that we would like to verify about
> >> DTrace.
> >>
> >> 2) What is the OS version and the kernel patch level of the system?
> >> SunOS nlvdhe321 5.10 Generic_141444-09 sun4v sparc SUNW,T5240
> >>
> >> 3) What is the Firmware level of the system?
> >> SP firmware 3.0.10.2.b
> >> SP firmware build number: 56134
> >> SP firmware date: Tue May 25 13:02:56 PDT 2010
> >> SP filesystem version: 0.1.22
> >> **********
> >> Thanks
> >>
> >> Scott Shurr | Solaris and Network Domain, Global Systems Support
> >> Email: Scott.Shurr at oracle.com
> >> Phone: 781-442-1352
> >> Oracle Global Customer Services
> >>
> >> _______________________________________________
> >> dtrace-discuss mailing list
> >> dtrace-discuss at opensolaris.org
>
> -------------- next part --------------
> An embedded and charset-unspecified text was scrubbed...
> Name: CSET.d
> URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20110711/30ac0744/attachment.ksh>
>
> ------------------------------
>
> End of dtrace-discuss Digest, Vol 75, Issue 8
> *********************************************

--
Michael Ernest
Inkling Research, Inc.
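
For illustration, here is a minimal sketch of the zeroing pattern described in the reply at the top of this message. It is not the customer's CSET.d, which is not visible in this thread. The probe choice (syscall::read), the variable name self->ts, and the dynvarsize and cleanrate values are all assumptions picked for the example. The point is only that a thread-local variable is allocated per software thread when it is assigned, and that assigning zero is what releases its dynamic variable storage.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Assumed values for illustration only. A larger dynvarsize and a faster
 * cleanrate give more headroom, but they do not replace zeroing.
 */
#pragma D option dynvarsize=16m
#pragma D option cleanrate=333hz

syscall::read:entry
{
	/* Allocates one dynamic variable for the thread that fired the probe. */
	self->ts = timestamp;
}

syscall::read:return
/self->ts/
{
	@lat["read latency (ns)"] = quantize(timestamp - self->ts);

	/* Assigning zero releases the storage held for this thread. */
	self->ts = 0;
}
```

Every clause that sets a self-> variable should have a matching clause that clears it. A script that sets the variable and never assigns zero holds one dynamic variable per thread that has ever fired the probe, which on a busy system over 24 hours can be far more than the number of hardware threads.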