I'm trying to collect function calls in Firefox in time order.

I want just the XUL library, but I must not miss any functions since I will be feeding the list to the linker for reordering of symbols.

I collect about 11GB of function names, but I can't manage to eliminate drops despite increasing the buffer size and switch rate, e.g.

grep CPU time.order
__ZL15SelectorMatchesR17RuleProcessorDataP13nsCSSSelectordtrace: 684061 drops on CPU 0
__ZNK19ndtrace: 205307 drops on CPU 0

Should I further increase the buffer size? Any other suggestions?

In reality, I just want the first invocation of each function, but I think keeping track of calls will take too much memory in DTrace, e.g. one associative array for the time order, indexed on timestamp or similar, plus another array indexed on the function name.

Thanks, Joel

---

#pragma D option mangled
#pragma D option switchrate=10hz
#pragma D option bufsize=512m

pid$target:XUL::entry
{
    printf("%s\n", probefunc);
}

---
firefox for android! http://wagerlabs.com
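For reference, the "first invocation of each function" requirement amounts to deduplicating a stream of names while preserving the order of first appearance, which is cheap to do offline; a small Python model of the idea (the call stream here is made up, not real Firefox output):

```python
# Model of the "first invocation only" requirement: deduplicate a
# stream of function names, keeping only the first occurrence of each
# name and preserving first-call order for the linker list.
def first_invocations(stream):
    seen = set()
    order = []
    for func in stream:
        if func not in seen:
            seen.add(func)
            order.append(func)
    return order

calls = ["main", "init", "draw", "init", "draw", "flush"]
print(first_invocations(calls))  # each name once, in first-call order
```

Post-processing the 11GB trace this way works, but it still requires capturing every call without drops, which is the hard part.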
On Nov 6, 2009, at 6:44 AM, Joel Reymont wrote:

> I'm trying to collect function calls in Firefox in time order.
> [...]

Joel,

I heard from another party that they had spent a lot of time looking at drops on OS X, and what they saw were drops coming from long periods of the consumer (the dtrace client) not being on CPU. This was with a very high switch rate and large buffers already.

You might try nice'ing up your dtrace client to a very high priority and see if that helps any.

Another thing you can try if you are on a desktop 8/16-way machine: turn off some CPUs. Each CPU consumes some amount of memory, and ultimately limits how big you can make the DTrace buffers. Reducing the count means your biggest possible buffer can be larger.

It's also a producer vs. consumer race, with one consumer (dtrace) and N producers. Making N smaller means the consumer can keep up a little bit easier.

James M
DTrace is a single-threaded process. It pulls the buffers out of the system on a cpu-by-cpu basis, and only when all CPU buffers are pulled out does it output the data.

The errors indicate, however, that the kernel buffers are overflowing. The only solution is to make the buffers bigger and/or increase the switch rate. Even if you greatly increase those rates/sizes, the work done by DTrace in user space may well impede the ability to extract the buffers at the necessary rate.

Jim Litchfield
--------------------

James McIlree wrote:

> On Nov 6, 2009, at 6:44 AM, Joel Reymont wrote:
>
>> I'm trying to collect function calls in Firefox in time order.
>> [...]
>
> I heard from another party that they had spent a lot of time looking at drops on OS X, and what they saw were drops coming from long periods of the consumer (dtrace client) not being on cpu.
> [...]
>
> James M

_______________________________________________
dtrace-discuss mailing list
dtrace-discuss at opensolaris.org
> I'm trying to collect function calls in Firefox in time order.
> [...]
> In reality, I just want the first invocation of each function, but I
> think keeping track of calls will take too much memory in DTrace, e.g.
> one associative array for the time order, indexed on timestamp or
> similar, plus another array indexed on the function name.

How big could the associative arrays possibly be compared to the buffer sizes you're considering? Elfdump could probably give you an upper bound, but I'd just try it and then look for alternatives if it fails...

Or, use the tick provider to get the most common functions, then finish off the list with the output of elfdump. There's no guarantee that *every* function would get called during a run anyway, and by definition anything the profiler missed shouldn't impact performance much.

Regards,
Ryan
--
This message posted from opensolaris.org
The issue is most likely that he's doing a printf for every record. The userland dtrace process is doing this:

+ read kernel buffers from each cpu into user space.
+ process kernel buffers in user space (in this case, eventually doing a printf).

If the buffers contain lots of data, he's likely not finishing the printfs before the buffers in the kernel overflow.

Jim Litchfield
-------------------

Ryan wrote:

>> I'm trying to collect function calls in Firefox in time order.
>> [...]
>
> How big could the associative arrays possibly be compared to the buffer sizes you're considering? Elfdump could probably give you an upper bound, but I'd just try it and then look for alternatives if it fails...
> [...]
>
> Regards,
> Ryan
On Fri, Nov 06, 2009 at 02:44:22PM +0000, Joel Reymont wrote:

> I'm trying to collect function calls in Firefox in time order.
> [...]
> In reality, I just want the first invocation of each function, but I
> think keeping track of calls will take too much memory in DTrace, e.g.
> one associative array for the time order, indexed on timestamp or
> similar, plus another array indexed on the function name.

This is custom-fit for an aggregation. Try:

pid$target:library::entry
{
    @a[probefunc] = min(timestamp);
}

The aggregation will automatically keep only one record per function per CPU, which should significantly reduce the data traffic.

Cheers,
- jonathan
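The effect of the min() aggregation can be illustrated outside of DTrace; this Python sketch is just a model of the semantics (the event data is made up): for each (function, timestamp) event, only the smallest timestamp per function survives, which is exactly what @a accumulates.

```python
# Model of DTrace's min() aggregation keyed on probefunc:
# for each (function, timestamp) event, keep the smallest timestamp,
# i.e. the time of the function's first invocation.
def aggregate_min(events):
    first_seen = {}
    for func, ts in events:
        if func not in first_seen or ts < first_seen[func]:
            first_seen[func] = ts
    return first_seen

# Hypothetical event stream: (probefunc, timestamp) pairs.
events = [("draw", 100), ("layout", 120), ("draw", 150), ("layout", 90)]
print(aggregate_min(events))  # smallest timestamp wins per function
```

One entry per key, regardless of call count, is why the kernel-to-user data traffic shrinks so dramatically compared to a printf per call.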
I thought about this, but how do I preserve the order of the function calls?

On Nov 13, 2009, at 7:33 AM, Jonathan Adams wrote:

> This is custom-fit for an aggregation. Try:
>
> pid$target:library::entry
> {
>     @a[probefunc] = min(timestamp);
> }
>
> The aggregation will automatically only keep one record per function per CPU, which should significantly reduce the data traffic.
>
> Cheers,
> - jonathan

---
firefox for android! http://wagerlabs.com
If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency? If this is the case, just track function callers and make that part of the aggregation:

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
}

Running that on bash during an 'ls' gives the following output (just the last few lines):

  bash`0x3a94c             hash_search    40
  bash`hash_search         hash_string    46
  libc.so.1`rw_write_held  no_preempt     67
  libc.so.1`rw_write_held  preempt        67
  bash`sh_xmalloc          sh_malloc     130

Regards,
Ryan

> I thought about this but how do I preserve the order of the function calls?
> [...]
Ryan,

On Nov 13, 2009, at 10:15 AM, Ryan wrote:

> If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency?

That would give me the "frequency order", but I'd like to have the "call order" as well.

Thanks, Joel
---
firefox for android! http://wagerlabs.com
On Fri, Nov 13, 2009 at 10:27:05AM +0000, Joel Reymont wrote:

> On Nov 13, 2009, at 10:15 AM, Ryan wrote:
>
>> If you're optimizing link-time code placement, I would think you'd just need a set of caller-callee relationships along with their frequency?
>
> That would give me the "frequency order" but I'd like to have the "call order" as well.

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
    @b[usym(ucaller), probefunc] = min(timestamp);
}

If you're worried about multiple threads of execution, then:

pid$target:library::entry
{
    @a[usym(ucaller), probefunc] = count();
    @b[tid, usym(ucaller), probefunc] = min(timestamp);
}

will separate them out.

Cheers,
- jonathan
Jonathan,

On Nov 13, 2009, at 3:22 PM, Jonathan Adams wrote:

> pid$target:library::entry
> {
>     @a[usym(ucaller), probefunc] = count();
>     @b[usym(ucaller), probefunc] = min(timestamp);
> }

I recall you can sort associative arrays by key, but can you actually sort them by value? What exactly keeps @b ordered by call time? How would I print @b such that the functions appear in the order they were called?

Thanks, Joel
---
firefox for android! http://wagerlabs.com
On Fri, Nov 13, 2009 at 03:37:28PM +0000, Joel Reymont wrote:

> I recall you can sort associative arrays by key but can you actually sort them by value?

These are not associative arrays; they're aggregations. They sort by value by default.

> What exactly keeps @b ordered by call time?

@b is an aggregation on minimum timestamp, grouped by (calling function (caller), called function (callee)). This means that for each (caller, callee) pair, the aggregation reports the minimum timestamp seen. Since timestamp is monotonically increasing, the min value of timestamp for a given (caller, callee) pair is the time of the first invocation, and the aggregation sorted by timestamp is the (caller, callee) graph in initial-function-call order.

> How would I print @b such that the functions appear in the order they were called?

It should just print it that way by default.

Cheers,
- jonathan
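If the @b output ever needs re-sorting offline (say, when merging results from several runs), a short post-processing step works; this sketch assumes each output line ends with the min(timestamp) value, in line with the whitespace-separated layout shown earlier in the thread, and the sample lines are made up:

```python
# Re-sort dtrace aggregation output lines by their trailing timestamp
# value to recover first-invocation (call) order. Sample lines are
# hypothetical, modelled on the caller/callee output shown earlier.
def first_call_order(lines):
    # The last whitespace-separated field is the min(timestamp) value.
    return sorted(lines, key=lambda line: int(line.split()[-1]))

lines = [
    "  bash`hash_search  hash_string  201500",
    "  bash`0x3a94c      hash_search  100300",
    "  bash`sh_xmalloc   sh_malloc    150900",
]
for line in first_call_order(lines):
    print(line)
```

Since timestamps from different runs start at different epochs, sorting merged output this way only orders calls within a run, not across runs.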