Chip.Bennett at exeloncorp.com
2012-May-22 20:04 UTC
[dtrace-discuss] Predicate vs. separate clause performance
Often a D program is easier to read if you break up a complex predicate into separate clauses, but I was wondering whether you sacrifice script performance to do that. For example, the following two D programs do the same thing: a 20-second interval quantize of pread/pwrite block sizes for a specific Oracle DB instance. Clearly "ora1.d" is easier to read, but is "ora2.d" more efficient?

    # cat ora1.d
    #!/usr/sbin/dtrace -s
    /* Quantize IO blocksize for Oracle random I/O for specific DB instance. */
    #pragma D option quiet
    inline string parmDBinst = $$1;

    syscall::pread*:entry, syscall::pwrite*:entry
    / execname == "oracle" /
    {
        this->cmd = strtok(curpsinfo->pr_psargs, " ");
        this->dbinst = (substr(this->cmd,0,6) == "oracle") ? substr(this->cmd,6) :
            (substr(this->cmd,0,4) == "ora_" ? substr(this->cmd,9) : "noDBnoDB");
    }

    syscall::pread*:entry, syscall::pwrite*:entry
    / this->dbinst == parmDBinst /
    {
        @IO[probefunc] = quantize(arg2);
    }

    tick-20s,END
    {
        printf ("%Y\n", walltimestamp);
        printa (@IO);
        clear (@IO);
    }

    # cat ora2.d
    #!/usr/sbin/dtrace -s
    /* Quantize IO blocksize for Oracle random I/O for specific DB instance. */
    #pragma D option quiet
    inline string parmDBinst = $$1;

    syscall::pread*:entry, syscall::pwrite*:entry
    / execname == "oracle" &&
      (this->cmd = strtok(curpsinfo->pr_psargs, " "),
       parmDBinst == ((substr(this->cmd,0,6) == "oracle") ? substr(this->cmd,6) :
           (substr(this->cmd,0,4) == "ora_" ? substr(this->cmd,9) : "noDBnoDB"))) /
    {
        @IO[probefunc] = quantize(arg2);
    }

    tick-20s,END
    {
        printf ("%Y\n", walltimestamp);
        printa (@IO);
        clear (@IO);
    }

Thanks,
Chip Bennett

-----------------------------------------
**************************************************
This e-mail and any of its attachments may contain Exelon Corporation proprietary information, which is privileged, confidential, or subject to copyright belonging to the Exelon Corporation family of Companies. This e-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in relation to the contents of and attachments to this e-mail is strictly prohibited and may be unlawful. If you have received this e-mail in error, please notify the sender immediately and permanently delete the original and any copy of this e-mail and any printout. Thank You.
**************************************************
Chad Mynhier
2012-May-23 15:45 UTC
[dtrace-discuss] Predicate vs. separate clause performance
On Tue, May 22, 2012 at 4:04 PM, <Chip.Bennett at exeloncorp.com> wrote:
> Often a D program is easier to read if you break up a complex predicate into
> separate clauses, but I was wondering if you sacrifice script performance to
> do that. For example, the following two D programs do the same thing (Do a
> 20 second interval quantize of pread/pwrite block sizes for a specific
> Oracle DB instance). Clearly "ora1.d" is easier to read, but is "ora2.d"
> more efficient?

What's your goal here: to make the script run faster when it actually has work to do, or to minimize the impact of the script on the system overall?

If you want to minimize the impact of the script, you can take advantage of predicate caching. (You can find this in the DTrace documentation in the chapter on performance considerations.) The predicate cache lets you shortcut a lot of processing in those cases where you know the predicate is going to fail.

You're getting some advantage from it already in ora1.d with the / execname == "oracle" / predicate. The first time a thread for some other process hits this predicate and fails, it will store the cache ID for this predicate. If the same thread hits the predicate again, the stored cache ID matches the predicate's cache ID, and you kick out immediately rather than processing the predicate again.

You could benefit further by making this->dbinst a thread-local variable. Given that you're changing that variable every time you hit the first clause, though, you would keep invalidating the predicate cache. To maximize your use of the predicate cache, only set self->dbinst if it's not already set (i.e., / execname == "oracle" && !self->dbinst /).

You'd also want to stop using the parmDBinst variable. Because it's a global variable, any predicate using it is uncacheable. $$1 is cacheable, though, so you can just use that directly.

Chad
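The skip-on-cached-failure behaviour described in this message can be sketched as a small user-space simulation. This is only an illustration, not DTrace source: `sim_thread`, `sim_probe`, `fire`, and `evaluations_for` are hypothetical names, the cache ID value 1 is arbitrary, and the check itself follows the `predcache` logic quoted later in this thread by Jonathan Adams.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CACHEID_NONE 0u   /* stand-in for DTRACE_CACHEIDNONE */

/* Hypothetical, simplified stand-ins for a thread and a probe. */
struct sim_thread {
    const char *execname;   /* name of the process the thread belongs to */
    unsigned    predcache;  /* ID of the last cacheable predicate that failed */
};

struct sim_probe {
    unsigned predcache;     /* the predicate's cache ID, or CACHEID_NONE */
    unsigned evaluations;   /* times the predicate was actually evaluated */
    unsigned actions;       /* times the clause body ran */
};

/* Models the predicate / execname == "oracle" /. */
static bool predicate(const struct sim_thread *t)
{
    return strcmp(t->execname, "oracle") == 0;
}

/* One probe firing on thread t. */
static void fire(struct sim_probe *p, struct sim_thread *t)
{
    if (p->predcache != CACHEID_NONE && p->predcache == t->predcache)
        return;                          /* known failure: skip evaluation */

    p->evaluations++;
    if (!predicate(t)) {
        if (p->predcache != CACHEID_NONE)
            t->predcache = p->predcache; /* remember the failure */
        return;
    }
    p->actions++;                        /* predicate true: run the clause */
}

/* Fire one probe n times from one thread; return the evaluation count. */
unsigned evaluations_for(const char *execname, unsigned cacheid, unsigned n)
{
    struct sim_thread t = { execname, CACHEID_NONE };
    struct sim_probe  p = { cacheid, 0, 0 };

    assert(execname != NULL);
    for (unsigned i = 0; i < n; i++)
        fire(&p, &t);
    return p.evaluations;
}
```

In this model, `evaluations_for("sshd", 1, 1000)` returns 1, because the first failure is cached and the remaining 999 firings drop out early. With `CACHEID_NONE` (an uncacheable predicate) the same call returns 1000, and a thread whose execname is "oracle" also evaluates the predicate 1000 times, since only failures are cached.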
Chip.Bennett at exeloncorp.com
2012-May-24 22:56 UTC
[dtrace-discuss] Predicate vs. separate clause performance
Chad,

That was very helpful, thank you.

So it sounds like you're saying that if the check expression has no cacheable components, it doesn't matter if I put the check in the predicate, or if I break the logic into two clauses and put the check in the first clause, with a thread-local variable to trigger the second clause.

Or at the least, I should try to break out the cacheable parts of the check expression, and include that in the predicate.

So a macro, like $$1, is cacheable, but an inline constant isn't? I guess the post-macro parsing of "D" sees $$1 the same as "abc", so I suppose that makes sense. But an inline constant is just as fixed as a quoted string, so I'd think it would be a good candidate for caching.

I had a similar issue back in 2007 where you couldn't use an inline string as the printf format. And I see that still hasn't changed. Oh well.

Chip Bennett
Exelon Corporation
BSC-IT Infrastructure & Operations
UNIX Production Engineering
Chad Mynhier
2012-May-25 14:56 UTC
[dtrace-discuss] Predicate vs. separate clause performance
Responses inline:

On Thu, May 24, 2012 at 6:56 PM, <Chip.Bennett at exeloncorp.com> wrote:
> Chad,
>
> That was very helpful, thank-you.
>
> So it sounds like you're saying that if the check expression has no
> cacheable components, it doesn't matter if I put the check in the
> predicate, or if I break the logic into two clauses and put the check
> in the first clause, with a thread-local variable to trigger the second
> clause.

Well, actually, you would see some benefit from breaking it into two clauses, if the predicate from the first clause is cacheable (like / execname == "oracle" /). And you would see some benefit from the second clause if its predicate is also cacheable.

> Or at the least, I should try to break out the cacheable parts of the
> check expression, and include that in the predicate.
>
> So a macro, like $$1 is cacheable, but an inline constant isn't? I guess
> the post-macro parsing of "D" sees $$1 the same as "abc", so I suppose
> that makes sense. But an inline constant is just as fixed as a quoted
> string, so I'd think it would be a good candidate for caching.

But inlines aren't necessarily constant values. For example, the following is a valid inline:

    inline string vtype = (this->vnode->v_type == VCHR) ? "CHR" : "BLK";

Chad
Jonathan Adams
2012-May-25 17:25 UTC
[dtrace-discuss] Predicate vs. separate clause performance
On Fri, May 25, 2012 at 10:56:31AM -0400, Chad Mynhier wrote:
> Responses inline:
>
> On Thu, May 24, 2012 at 6:56 PM, <Chip.Bennett at exeloncorp.com> wrote:
> > Chad,
> >
> > That was very helpful, thank-you.
> >
> > So it sounds like you're saying that if the check expression has no
> > cacheable components, it doesn't matter if I put the check in the
> > predicate, or if I break the logic into two clauses and put the check
> > in the first clause, with a thread-local variable to trigger the second
> > clause.
>
> Well, actually, you would see some benefit from breaking it into two
> clauses, if the predicate from the first clause is cacheable (like
> /execname == "oracle" /.) And you would see some benefit from the
> second clause if its predicate is also cacheable.

Also, remember that predicate caching only remembers the *last* cacheable predicate the current thread evaluated, so it is mostly useful for predicates which will be hit more than once by a particular thread. For example, something like:

    fbt:::
    /self->foo/
    {
        /* XXX do something */
    }

    fbt:::
    /self->foo/
    {
        /* XXX do something */
    }

is actually pessimal, since the predicate cache will always be out of date. (The fact that the two predicates are identical is not recognized, so each gets assigned a separate "predcache" ID.) The logic in dtrace is:

    if (probe->predcache != DTRACE_CACHEIDNONE &&
        probe->predcache == curthread->t_predcache) {
            /* fail immediately */
    }

    ... evaluate predicate

    if (predicate is false) {
            if (probe->predcache != DTRACE_CACHEIDNONE) {
                    curthread->t_predcache = probe->predcache;
            }
            continue;
    }

In fact, any time you have more than one enabling for a probe with a predicate, the probe's 'predcache' will be DTRACE_CACHEIDNONE, since the two enablings cannot (as it stands) have the same predcacheid.

There are three directions which could be explored to improve this:

1. Extend the predicate cache to actually compare the DIFO of active predicates, so that identical predicates get the same predcacheid. That would significantly improve the flexibility for consumers, since you can have different action statements for different probes.

2. Increase the predcache depth (allow multiple t_predcache IDs). This might not be worth it, since we check the cache on every probe firing.

3. We could investigate checking the predcache before evaluating the predicate, which could help for complicated predicates. But the benefit of the predcache is really that it drops out of dtrace processing as early as possible; this would have less of an effect.

#1 seems like the biggest win to me.

Cheers,
- jonathan
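To make the cost of the two-clause fbt::: example concrete, here is a rough user-space model of a probe with two predicated enablings. It is a sketch under assumptions, not DTrace source. `sim_probe`, `assign_cacheid`, `count_evaluations`, and `evaluations_for_two_clauses` are hypothetical names, and predicate text stands in for DIFO. The model also assumes the cache is consulted once per firing, before any clause runs, which is one reading of the snippet in this message. The `share_identical` flag models direction 1 from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHEID_NONE 0u          /* stand-in for DTRACE_CACHEIDNONE */
#define MAX_ENABLINGS 4

/* Hypothetical model of a probe with up to MAX_ENABLINGS predicated clauses.
 * Each predicate is represented by its source text, standing in for DIFO. */
struct sim_probe {
    const char *preds[MAX_ENABLINGS];
    size_t      npreds;
    unsigned    predcache;
};

/* Assign the probe's cache ID.
 * share_identical == false: more than one enabling means no usable ID,
 *   as the message describes for current behaviour.
 * share_identical == true: identical predicates share one ID (direction 1). */
static void assign_cacheid(struct sim_probe *p, bool share_identical)
{
    bool all_same = true;

    for (size_t i = 1; i < p->npreds; i++)
        if (strcmp(p->preds[0], p->preds[i]) != 0)
            all_same = false;

    if (p->npreds == 1 || (p->npreds > 1 && share_identical && all_same))
        p->predcache = 1;            /* any non-zero ID works for one probe */
    else
        p->predcache = CACHEID_NONE;
}

/* Count predicate evaluations when one thread, for which every predicate
 * is false, fires the probe `firings` times. */
unsigned count_evaluations(const struct sim_probe *p, unsigned firings)
{
    unsigned t_predcache = CACHEID_NONE;   /* per-thread single-entry cache */
    unsigned evaluations = 0;

    for (unsigned f = 0; f < firings; f++) {
        if (p->predcache != CACHEID_NONE && p->predcache == t_predcache)
            continue;                      /* whole firing skipped */
        for (size_t i = 0; i < p->npreds; i++) {
            evaluations++;                 /* predicate evaluated: false */
            if (p->predcache != CACHEID_NONE)
                t_predcache = p->predcache;
        }
    }
    return evaluations;
}

/* The fbt::: example: two clauses, both predicated on /self->foo/. */
unsigned evaluations_for_two_clauses(bool share_identical, unsigned firings)
{
    struct sim_probe p = { { "self->foo", "self->foo" }, 2, CACHEID_NONE };

    assign_cacheid(&p, share_identical);
    return count_evaluations(&p, firings);
}
```

Under these assumptions, `evaluations_for_two_clauses(false, 1000)` returns 2000, because both predicates are evaluated on every firing. With `share_identical` set to true it returns 2, since after the first firing the thread's cached ID matches and later firings drop out immediately.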