I'm trying to collect function calls in Firefox in time order.

I want just the XUL library, but I must not miss any functions since I will be feeding the list to the linker for reordering of symbols.

I collect about 11GB of function names, but I can't manage to eliminate drops despite increasing the buffer size and switch rate, e.g.

grep CPU time.order
__ZL15SelectorMatchesR17RuleProcessorDataP13nsCSSSelectordtrace: 684061 drops on CPU 0
__ZNK19ndtrace: 205307 drops on CPU 0

Should I further increase the buffer size? Any other suggestions?

In reality, I just want the first invocation of each function, but I think keeping track of calls will take too much memory in DTrace, e.g. one associative array for the time order, indexed on timestamp or similar, plus another array indexed on the function name.

Thanks, Joel

---

#pragma D option mangled
#pragma D option switchrate=10hz
#pragma D option bufsize=512m

pid$target:XUL::entry
{
    printf("%s\n", probefunc);
}

---
firefox for android! http://wagerlabs.com
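For reference, the "first invocation of each function" requirement amounts to deduplicating a stream of names while preserving the order of first appearance, which is cheap to do offline; a small Python model of the idea (the call stream here is made up, not real Firefox output):

```python
# Model of the "first invocation only" requirement: deduplicate a
# stream of function names, keeping only the first occurrence of each
# name and preserving first-call order for the linker list.
def first_invocations(stream):
    seen = set()
    order = []
    for func in stream:
        if func not in seen:
            seen.add(func)
            order.append(func)
    return order

calls = ["main", "init", "draw", "init", "draw", "flush"]
print(first_invocations(calls))  # each name once, in first-call order
```

Post-processing the 11GB trace this way works, but it still requires capturing every call without drops, which is the hard part.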
On Nov 6, 2009, at 6:44 AM, Joel Reymont wrote:

> I'm trying to collect function calls in Firefox in time order.
> [...]

Joel,

I heard from another party that they had spent a lot of time looking at drops on OS X, and what they saw were drops coming from long periods of the consumer (the dtrace client) not being on CPU. This was with a very high switch rate and large buffers already.

You might try nice'ing up your dtrace client to a very high priority and see if that helps any.

Another thing you can try if you are on a desktop 8/16-way machine: turn off some CPUs. Each CPU consumes some amount of memory, and ultimately limits how big you can make the DTrace buffers. Reducing the count means your biggest possible buffer can be larger.

It's also a producer vs. consumer race, with one consumer (dtrace) and N producers. Making N smaller means the consumer can keep up a little bit easier.

James M
DTrace is a single-threaded process. It pulls the buffers out of the system on a cpu-by-cpu basis, and only when all CPU buffers are pulled out does it output the data.

The errors indicate, however, that the kernel buffers are overflowing. The only solution is to make the buffers bigger and/or increase the switch rate. Even if you greatly increase those rates/sizes, the work done by DTrace in user space may well impede the ability to extract the buffers at the necessary rate.

Jim Litchfield
--------------------

James McIlree wrote:

> On Nov 6, 2009, at 6:44 AM, Joel Reymont wrote:
>
>> I'm trying to collect function calls in Firefox in time order.
>> [...]
>
> I heard from another party that they had spent a lot of time looking at drops on OS X, and what they saw were drops coming from long periods of the consumer (dtrace client) not being on cpu.
> [...]
>
> James M

_______________________________________________
dtrace-discuss mailing list
dtrace-discuss at opensolaris.org
> I'm trying to collect function calls in Firefox in time order.
> [...]
> In reality, I just want the first invocation of each function, but I
> think keeping track of calls will take too much memory in DTrace, e.g.
> one associative array for the time order, indexed on timestamp or
> similar, plus another array indexed on the function name.

How big could the associative arrays possibly be compared to the buffer sizes you're considering? Elfdump could probably give you an upper bound, but I'd just try it and then look for alternatives if it fails...

Or, use the tick provider to get the most common functions, then finish off the list with the output of elfdump. There's no guarantee that *every* function would get called during a run anyway, and by definition anything the profiler missed shouldn't impact performance much.

Regards,
Ryan
--
This message posted from opensolaris.org
The issue is most likely that he's doing a printf for every record. The userland dtrace process is doing this:

+ read kernel buffers from each cpu into user space.
+ process kernel buffers in user space (in this case, eventually doing a printf).

If the buffers contain lots of data, he's likely not finishing the printfs before the buffers in the kernel overflow.

Jim Litchfield
-------------------

Ryan wrote:

>> I'm trying to collect function calls in Firefox in time order.
>> [...]
>
> How big could the associative arrays possibly be compared to the buffer sizes you're considering? Elfdump could probably give you an upper bound, but I'd just try it and then look for alternatives if it fails...
> [...]
>
> Regards,
> Ryan
On Fri, Nov 06, 2009 at 02:44:22PM +0000, Joel Reymont wrote:

> I'm trying to collect function calls in Firefox in time order.
> [...]
> In reality, I just want the first invocation of each function, but I
> think keeping track of calls will take too much memory in DTrace, e.g.
> one associative array for the time order, indexed on timestamp or
> similar, plus another array indexed on the function name.

This is custom-fit for an aggregation. Try:

pid$target:library::entry
{
    @a[probefunc] = min(timestamp);
}

The aggregation will automatically keep only one record per function per CPU, which should significantly reduce the data traffic.

Cheers,
- jonathan
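The effect of the min() aggregation can be illustrated outside of DTrace; this Python sketch is just a model of the semantics (the event data is made up): for each (function, timestamp) event, only the smallest timestamp per function survives, which is exactly what @a accumulates.

```python
# Model of DTrace's min() aggregation keyed on probefunc:
# for each (function, timestamp) event, keep the smallest timestamp,
# i.e. the time of the function's first invocation.
def aggregate_min(events):
    first_seen = {}
    for func, ts in events:
        if func not in first_seen or ts < first_seen[func]:
            first_seen[func] = ts
    return first_seen

# Hypothetical event stream: (probefunc, timestamp) pairs.
events = [("draw", 100), ("layout", 120), ("draw", 150), ("layout", 90)]
print(aggregate_min(events))  # smallest timestamp wins per function
```

One entry per key, regardless of call count, is why the kernel-to-user data traffic shrinks so dramatically compared to a printf per call.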
I thought about this, but how do I preserve the order of the function calls?

On Nov 13, 2009, at 7:33 AM, Jonathan Adams wrote:

> This is custom-fit for an aggregation. Try:
>
> pid$target:library::entry
> {
>     @a[probefunc] = min(timestamp);
> }
>
> The aggregation will automatically only keep one record per function per CPU, which should significantly reduce the data traffic.
>
> Cheers,
> - jonathan

---
firefox for android! http://wagerlabs.com
If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency? If this is the case, just track function callers and make that part of the aggregation:

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
}

Running that on bash during an 'ls' gives the following output (just the last few lines):

  bash`0x3a94c             hash_search    40
  bash`hash_search         hash_string    46
  libc.so.1`rw_write_held  no_preempt     67
  libc.so.1`rw_write_held  preempt        67
  bash`sh_xmalloc          sh_malloc     130

Regards,
Ryan

> I thought about this but how do I preserve the order of the function calls?
> [...]
Ryan,

On Nov 13, 2009, at 10:15 AM, Ryan wrote:

> If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency?

That would give me the "frequency order", but I'd like to have the "call order" as well.

Thanks, Joel
---
firefox for android! http://wagerlabs.com
On Fri, Nov 13, 2009 at 10:27:05AM +0000, Joel Reymont wrote:

> On Nov 13, 2009, at 10:15 AM, Ryan wrote:
>
>> If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency?
>
> That would give me the "frequency order" but I'd like to have the "call order" as well.

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
    @b[usym(ucaller), probefunc] = min(timestamp);
}

If you're worried about multiple threads of execution, then:

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
    @b[tid, usym(ucaller), probefunc] = min(timestamp);
}

will separate them out.

Cheers,
- jonathan
Jonathan,

On Nov 13, 2009, at 3:22 PM, Jonathan Adams wrote:

> pid$target:library::entry
> {
>     @a[usym(ucaller), probefunc] = count();
>     @b[usym(ucaller), probefunc] = min(timestamp);
> }

I recall you can sort associative arrays by key, but can you actually sort them by value? What exactly keeps @b ordered by call time? How would I print @b such that the functions appear in the order they were called?

Thanks, Joel
---
firefox for android! http://wagerlabs.com
On Fri, Nov 13, 2009 at 03:37:28PM +0000, Joel Reymont wrote:

> I recall you can sort associative arrays by key but can you actually sort them by value?

These are not associative arrays; they're aggregations. They sort by value by default.

> What exactly keeps @b ordered by call time?

@b is an aggregation on minimum timestamp, grouped by (calling function (caller), called function (callee)). This means that for each (caller, callee) pair, the aggregation reports the minimum timestamp seen. Since timestamp is monotonically increasing, the min value of timestamp for a given (caller, callee) pair is the time of the first invocation, and the aggregation sorted by timestamp is the (caller, callee) graph in initial-function-call order.

> How would I print @b such that the functions appear in the order they were called?

It should just print it that way by default.

Cheers,
- jonathan
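If the @b output ever needs re-sorting offline (say, when merging results from several runs), a short post-processing step works; this sketch assumes each output line ends with the min(timestamp) value, in line with the whitespace-separated layout shown earlier in the thread, and the sample lines are made up:

```python
# Re-sort dtrace aggregation output lines by their trailing timestamp
# value to recover first-invocation (call) order. Sample lines are
# hypothetical, modelled on the caller/callee output shown earlier.
def first_call_order(lines):
    # The last whitespace-separated field is the min(timestamp) value.
    return sorted(lines, key=lambda line: int(line.split()[-1]))

lines = [
    "  bash`hash_search  hash_string  201500",
    "  bash`0x3a94c      hash_search  100300",
    "  bash`sh_xmalloc   sh_malloc    150900",
]
for line in first_call_order(lines):
    print(line)
```

Since timestamps from different runs start at different epochs, sorting merged output this way only orders calls within a run, not across runs.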