Hello. I''m trying to implement many probes but have them only fire sometimes and am seeing significant slowdown. In the users manual, it talks about how caching can speed things up. I don''t see the savings. For a simple example, consider the following two probes: ------------------------test.d (no caching of predicate possible) int global_trace; pid$target:a.out::entry / global_trace == 1 / { @[probefunc] = count(); } ---------------------self.d (Can cache) int self do_trace; pid$target:a.out::entry / self->do_trace == 1 / { @[probefunc] = count(); } When I run these probes, I don''t see any "noticable" performance gain (I know the probes don''t actually DO anything) bash-3.00# time dtrace -s test.d -c "gzip zip" dtrace: script ''test.d'' matched 56 probes dtrace: pid 5633 has exited real 0m1.431s user 0m0.356s sys 0m0.312s bash-3.00# time dtrace -s self.d -c "gzip zip" dtrace: script ''self.d'' matched 56 probes dtrace: pid 5638 has exited real 0m1.254s user 0m0.352s sys 0m0.250s Compare this to running without any tracing: bash-3.00# time gzip zip real 0m0.025s user 0m0.012s sys 0m0.010s Caching is 12% faster in this case, but both are 50x slower than when not being traced. Am I missing something here or is this expected behavior? Thanks for any insight anyone may have on this. Charlie This message posted from opensolaris.org
Hey Charlie, Unfortunately the biggest slice of the performance hit is going to be the actual instrumentation itself. The pid provider invocation you''re using turns every function call into a trip in and out of the kernel with a bunch of work along the way. As you''ve seen, using a cachable predicate will help a bit, but evaluating the predicate isn''t taking much time comparatively. Is there a way you can narrow your focus to come up with a DTrace invocation that enables fewer probes? Adam On Thu, Nov 16, 2006 at 12:15:02PM -0800, Charlie Miller wrote:> Hello. I''m trying to implement many probes but have them only fire sometimes and am seeing significant slowdown. In the users manual, it talks about how caching can speed things up. I don''t see the savings. For a simple example, consider the following two probes: > > ------------------------test.d (no caching of predicate possible) > int global_trace; > > pid$target:a.out::entry > / global_trace == 1 / > { > @[probefunc] = count(); > } > > ---------------------self.d (Can cache) > int self do_trace; > > pid$target:a.out::entry > / self->do_trace == 1 / > { > @[probefunc] = count(); > } > > When I run these probes, I don''t see any "noticable" performance gain (I know the probes don''t actually DO anything) > > bash-3.00# time dtrace -s test.d -c "gzip zip" > dtrace: script ''test.d'' matched 56 probes > dtrace: pid 5633 has exited > > > real 0m1.431s > user 0m0.356s > sys 0m0.312s > > bash-3.00# time dtrace -s self.d -c "gzip zip" > dtrace: script ''self.d'' matched 56 probes > dtrace: pid 5638 has exited > > > real 0m1.254s > user 0m0.352s > sys 0m0.250s > > Compare this to running without any tracing: > bash-3.00# time gzip zip > > real 0m0.025s > user 0m0.012s > sys 0m0.010s > > Caching is 12% faster in this case, but both are 50x slower than when not being traced. Am I missing something here or is this expected behavior? Thanks for any insight anyone may have on this. > > Charlie > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
Adam, Thanks for the reply. I can see why each probe entails much overhead, even if no action is taken. I''ll tell you what I''m trying to do and explain why I think there is no better way to do it and then maybe you can possibly point out some clever way to do what I''m trying to do that I hadn''t thought of ;) Here is the problem I''m trying to solve. Given two different, but similar, files input into the same program, identify the first instruction where execution differs when fed both of them. For example, consider the gzip example I benchmarked. If you fed in a gzipped file, and the same gzipped file with one bit in the middle flipped, on decompression of the first file, it will decompress properly and decompression of the second file will result in an error. If you could identify the first spot where execution differed, you would have located the routine responsible for identifying errors in the file. To solve this, simply requires "tracing" execution of each file and comparing which instructions are executed or functions are called. To come up with the first instruction, I figured, I would record the order of all the function calls and returns with a simple trace like: pid$target:library1::entry, ... pid$target:libraryn::entry, pid$target:a.out::entry / global_trace == 1 / { } where I don''t actually start tracing until after the file is opened, to reduce overhead. Since I don''t know ahead of time which instructions, functions, or even modules will be the one I''m looking for, I''m not sure how I can use less probes in this case. Charlie This message posted from opensolaris.org
Hey Charlie, One thing you could try is to use the stop() action to stop the process when it''s reached the event of interest and then use the system action to fire off another DTrace script that uses the more invasive set of probes. Of course, that''s only going to help if the region your interested if quites small compared to the rest of the program''s execution. Adam On Fri, Nov 17, 2006 at 11:56:24AM -0800, Charlie Miller wrote:> Adam, > > Thanks for the reply. I can see why each probe entails much overhead, even if no action is taken. I''ll tell you what I''m trying to do and explain why I think there is no better way to do it and then maybe you can possibly point out some clever way to do what I''m trying to do that I hadn''t thought of ;) > > Here is the problem I''m trying to solve. Given two different, but similar, files input into the same program, identify the first instruction where execution differs when fed both of them. For example, consider the gzip example I benchmarked. If you fed in a gzipped file, and the same gzipped file with one bit in the middle flipped, on decompression of the first file, it will decompress properly and decompression of the second file will result in an error. If you could identify the first spot where execution differed, you would have located the routine responsible for identifying errors in the file. To solve this, simply requires "tracing" execution of each file and comparing which instructions are executed or functions are called. > > To come up with the first instruction, I figured, I would record the order of all the function calls and returns with a simple trace like: > > pid$target:library1::entry, > ... > pid$target:libraryn::entry, > pid$target:a.out::entry > / global_trace == 1 / > { > } > > where I don''t actually start tracing until after the file is opened, to reduce overhead. Since I don''t know ahead of time which instructions, functions, or even modules will be the one I''m looking for, I''m not sure how I can use less probes in this case. > > Charlie > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
Nicolas Williams
2006-Nov-17 20:27 UTC
[dtrace-discuss] Re: predicate caching ineffective?
On Fri, Nov 17, 2006 at 12:01:56PM -0800, Adam Leventhal wrote:> One thing you could try is to use the stop() action to stop the process when > it''s reached the event of interest and then use the system action to fire > off another DTrace script that uses the more invasive set of probes. Of > course, that''s only going to help if the region your interested if quites > small compared to the rest of the program''s execution.Yeah, have a DTrace script tracing read(2) syscalls from gunzip and stop() the process when it reads the data that you care about, then system() to start a new DTrace script to enable the pid provider probes that you''re interested in. This may make the whole process run faster. That said, it isn''t necessarily the case that the function containing the first branch that''s taken differently is the function where the error is detected. If you''re interested in where the error is detected then you might ustack() the process when some error function is called -- that''d be a single pid provider probe -- and then through code inspection walk back until you find other functions that you might want to trace, and go from there. Nico --
Charlie Miller
2006-Nov-17 20:34 UTC
[dtrace-discuss] Re: Re: predicate caching ineffective?
This is a good idea as it will at least make the part up until I need to actually start tracing run faster. However, I don''t think I can actually call a system inside a probe to launch a dtrace against the same process. When I try, I get the error: dtrace: failed to grab pid 2919: process is traced However, I just have the probe call stop() and exit() and then launch a second instance of dtrace with all the probes enabled, it does work. Thanks for the idea! This message posted from opensolaris.org
Charlie Miller
2006-Nov-17 20:37 UTC
[dtrace-discuss] Re: Re: predicate caching ineffective?
Nico, thanks for the comment. See inline:> Yeah, have a DTrace script tracing read(2) syscalls > from gunzip and > stop() the process when it reads the data that you > care about, then > system() to start a new DTrace script to enable the > pid provider probes > that you''re interested in. This may make the whole > process run faster. > > That said, it isn''t necessarily the case that the > function containing > the first branch that''s taken differently is the > function where the > error is detected. >Maybe not, but I hope it is. I can''t think of an example right now where it isn''t.> If you''re interested in where the error is detected > then you might > ustack() the process when some error function is > called -- that''d be a > single pid provider probe -- and then through code > inspection walk back > until you find other functions that you might want to > trace, and go from > there.I plan for this to be a more general purpose tool, the gzip was just an example. Therefore, I don''t want to suppose I know which are the error functions and I want to avoid code inspection.> > Nico > -- > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org >This message posted from opensolaris.org
Adam Leventhal
2006-Nov-17 22:08 UTC
[dtrace-discuss] Re: Re: predicate caching ineffective?
You should be able to do it if you use ''pid2919'' as the provider rather than using ''-p 2919'' in your dtrace(1M) invocation. Adam On Fri, Nov 17, 2006 at 12:34:09PM -0800, Charlie Miller wrote:> This is a good idea as it will at least make the part up until I need to actually start tracing run faster. However, I don''t think I can actually call a system inside a probe to launch a dtrace against the same process. When I try, I get the error: > > dtrace: failed to grab pid 2919: process is traced > > However, I just have the probe call stop() and exit() and then launch a second instance of dtrace with all the probes enabled, it does work. Thanks for the idea! > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
Peter Lawrence
2006-Nov-17 23:04 UTC
[dtrace-discuss] Re: Re: predicate caching ineffective?
Charlie Miller wrote On 11/17/06 12:37 PM,:> Nico, thanks for the comment. See inline: > > >>Yeah, have a DTrace script tracing read(2) syscalls >>from gunzip and >>stop() the process when it reads the data that you >>care about, then >>system() to start a new DTrace script to enable the >>pid provider probes >>that you''re interested in. This may make the whole >>process run faster. >> >>That said, it isn''t necessarily the case that the >>function containing >>the first branch that''s taken differently is the >>function where the >>error is detected. >> > > > Maybe not, but I hope it is. I can''t think of an example right now where it isn''t.possibly mistaken assumptions: single verse multi threaded uses the select syscall or not uses syscalls that can return -1 and an errno that means there wasn''t an error you just have to try again, or not. the program examines the filename, its date, or other fstat attributes that would be different between the files, or not. the list is always longer than you think, or as Haldane would say, longer than you can think... -Pete Lawrence.> > >>If you''re interested in where the error is detected >>then you might >>ustack() the process when some error function is >>called -- that''d be a >>single pid provider probe -- and then through code >>inspection walk back >>until you find other functions that you might want to >>trace, and go from >>there. > > > I plan for this to be a more general purpose tool, the gzip was just an example. Therefore, I don''t want to suppose I know which are the error functions and I want to avoid code inspection. > > >>Nico >>-- >>_______________________________________________ >>dtrace-discuss mailing list >>dtrace-discuss at opensolaris.org >> > > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
michael schuster
2006-Nov-20 08:54 UTC
[dtrace-discuss] Re: predicate caching ineffective?
On Fri, 17 Nov 2006 11:56:24 -0800 (PST) Charlie Miller <cmiller at securityevaluators.com> wrote:> Adam, > > Thanks for the reply. I can see why each probe entails much > overhead, even if no action is taken. I''ll tell you what I''m trying > to do and explain why I think there is no better way to do it and > then maybe you can possibly point out some clever way to do what I''m > trying to do that I hadn''t thought of ;)just a (probably stupid) question: there doesn''t seem to be a necessity to run this analysis night and day during production, but rather only once or twice on every data set, or maybe a few times. Is run time really such a concern under these circumstances? an idea: maybe speculations can help you narrow this down (this may not have much influence on the runtime, though). HTH -- Michael Schuster Sun Microsystems, Inc.
Charlie Miller
2006-Nov-20 14:02 UTC
[dtrace-discuss] Re: Re: predicate caching ineffective?
Not a stupid question at all. You''re right in the scenario I described. However, I had simplified things a bit. I''d really like to do this for a program evaluating an entire test suite. Therefore, since there may be thousands of test cases, time becomes important. This message posted from opensolaris.org