thr3ads.net - dtrace discuss - [dtrace-discuss] Predicate vs. separate clause performance [May 2012]

Chip.Bennett at exeloncorp.com

2012-May-22 20:04 UTC

[dtrace-discuss] Predicate vs. separate clause performance

Often a D program is easier to read if you break up a complex predicate
into separate clauses, but I was wondering if you sacrifice script
performance to do that.  For example, the following two D programs do
the same thing (Do a 20 second interval quantize of pread/pwrite block
sizes for a specific Oracle DB instance).  Clearly "ora1.d" is easier
to read, but is "ora2.d" more efficient?

 

# cat ora1.d
#!/usr/sbin/dtrace -s

/*
   Quantize IO blocksize for Oracle random I/O
   for specific DB instance.
*/

#pragma D option quiet

inline string parmDBinst = $$1;

syscall::pread*:entry,
syscall::pwrite*:entry
/ execname == "oracle" /
{
   this->cmd = strtok(curpsinfo->pr_psargs, " ");
   this->dbinst =
      (substr(this->cmd,0,6) == "oracle") ? substr(this->cmd,6) :
      (substr(this->cmd,0,4) == "ora_" ? substr(this->cmd,9) : "noDBnoDB");
}

syscall::pread*:entry,
syscall::pwrite*:entry
/ this->dbinst == parmDBinst /
{
   @IO[probefunc] = quantize(arg2);
}

tick-20s,END
{
   printf ("%Y\n", walltimestamp);
   printa (@IO);
   clear (@IO);
}

# cat ora2.d
#!/usr/sbin/dtrace -s

/*
   Quantize IO blocksize for Oracle random I/O
   for specific DB instance.
*/

#pragma D option quiet

inline string parmDBinst = $$1;

syscall::pread*:entry,
syscall::pwrite*:entry
/ execname == "oracle" &&
    (this->cmd = strtok(curpsinfo->pr_psargs, " "),
     parmDBinst == ((substr(this->cmd,0,6) == "oracle") ? substr(this->cmd,6) :
        (substr(this->cmd,0,4) == "ora_" ? substr(this->cmd,9) : "noDBnoDB"))) /
{
   @IO[probefunc] = quantize(arg2);
}

tick-20s,END
{
   printf ("%Y\n", walltimestamp);
   printa (@IO);
   clear (@IO);
}

#

 

Thanks,

Chip Bennett




-----------------------------------------
**************************************************
This e-mail and any of its attachments may contain Exelon
Corporation proprietary information, which is privileged,
confidential, or subject to copyright belonging to the Exelon
Corporation family of Companies. 
This e-mail is intended solely for the use of the individual or
entity to which it is addressed.  If you are not the intended
recipient of this e-mail, you are hereby notified that any
dissemination, distribution, copying, or action taken in relation
to the contents of and attachments to this e-mail is strictly
prohibited and may be unlawful.  If you have received this e-mail
in error, please notify the sender immediately and permanently
delete the original and any copy of this e-mail and any printout.
Thank You.
**************************************************

Chad Mynhier

2012-May-23 15:45 UTC

[dtrace-discuss] Predicate vs. separate clause performance

On Tue, May 22, 2012 at 4:04 PM,  <Chip.Bennett at exeloncorp.com> wrote:
> Often a D program is easier to read if you break up a complex predicate into
> separate clauses, but I was wondering if you sacrifice script performance to
> do that.  For example, the following two D programs do the same thing (Do a
> 20 second interval quantize of pread/pwrite block sizes for a specific
> Oracle DB instance).  Clearly "ora1.d" is easier to read, but is "ora2.d"
> more efficient?

What's your goal here, to make the script run faster when it actually
has work to do, or to minimize the impact of the script on the system
overall?

If you want to minimize the impact of the script, you can take
advantage of predicate caching.  (You can find this in the DTrace
documentation in the chapter on performance considerations.)

The predicate cache lets you shortcut a lot of processing in those
cases where you know the predicate is going to fail.  You're getting
some advantage from it already in ora1.d with the / execname == "oracle" /
predicate.  The first time a thread for some other process
hits this predicate and fails, it will store the cache ID for this
predicate.  If the same thread hits the predicate again, the stored
cache ID matches the predicate's cache ID, and you kick out
immediately rather than processing the predicate again.

You could benefit further by making this->dbinst a thread-local
variable.  Given that you're changing that variable every time you hit
the first clause, though, you would keep invalidating the predicate
cache.  To maximize your use of the predicate cache, only set
self->dbinst if it's not already set (i.e., / execname == "oracle" &&
!self->dbinst /).

You'd also want to stop using the parmDBinst variable.  Because it's a
global variable, any predicate using it is uncacheable.  $$1 is
cacheable, though, so you can just use that directly.

Chad

Chip.Bennett at exeloncorp.com

2012-May-24 22:56 UTC

[dtrace-discuss] Predicate vs. separate clause performance

Chad,

That was very helpful, thank-you.

So it sounds like you're saying that if the check expression has no
cacheable components, it doesn't matter if I put the check in the
predicate, or if I break the logic into two clauses and put the check in the
first clause, with a thread-local variable to trigger the second clause.

Or at the least, I should try to break out the cacheable parts of the check
expression, and include that in the predicate.

So a macro, like $$1 is cacheable, but an inline constant isn't?  I
guess the post-macro parsing of "D" sees $$1 the same as "abc", so I
suppose that makes sense.  But an inline constant is just as fixed as a
quoted string, so I'd think it would be a good candidate for caching.

I had a similar issue back in 2007 where you couldn't use an inline
string as the printf format.  And I see that still hasn't changed.  Oh
well.

Chip Bennett
Exelon Corporation
BSC-IT Infrastructure & Operations
UNIX Production Engineering
10 South Dearborn - 45th floor, Cube NW-008
Chicago, IL 60603
312-394-4245 direct / 312-394-7354 fax
chip.bennett at exeloncorp.com

Chad Mynhier

2012-May-25 14:56 UTC

[dtrace-discuss] Predicate vs. separate clause performance

Responses inline:

On Thu, May 24, 2012 at 6:56 PM,  <Chip.Bennett at exeloncorp.com> wrote:
> Chad,
>
> That was very helpful, thank-you.
>
> So it sounds like you're saying that if the check expression has no
> cacheable components, it doesn't matter if I put the check in the
> predicate, or if I break the logic into two clauses and put the check in the
> first clause, with a thread-local variable to trigger the second clause.

Well, actually, you would see some benefit from breaking it into two
clauses, if the predicate from the first clause is cacheable (like
/execname == "oracle" /.)  And you would see some benefit from the
second clause if its predicate is also cacheable.

> Or at the least, I should try to break out the cacheable parts of the check
> expression, and include that in the predicate.
>
> So a macro, like $$1 is cacheable, but an inline constant isn't?
> I guess the post-macro parsing of "D" sees $$1 the same as
> "abc", so I suppose that makes sense.  But an inline constant is just
> as fixed as a quoted string, so I'd think it would be a good candidate
> for caching.

But inlines aren't necessarily constant values.  For example, the
following is a valid inline:

inline string vtype = (this->vnode->v_type == VCHR) ? "CHR" : "BLK";

Chad

Jonathan Adams

2012-May-25 17:25 UTC

[dtrace-discuss] Predicate vs. separate clause performance

On Fri, May 25, 2012 at 10:56:31AM -0400, Chad Mynhier wrote:
> Responses inline:
>
> On Thu, May 24, 2012 at 6:56 PM,  <Chip.Bennett at exeloncorp.com> wrote:
> > Chad,
> >
> > That was very helpful, thank-you.
> >
> > So it sounds like you're saying that if the check expression has no
> > cacheable components, it doesn't matter if I put the check in the
> > predicate, or if I break the logic into two clauses and put the check
> > in the first clause, with a thread-local variable to trigger the second
> > clause.
>
> Well, actually, you would see some benefit from breaking it into two
> clauses, if the predicate from the first clause is cacheable (like
> /execname == "oracle" /.)  And you would see some benefit from the
> second clause if its predicate is also cacheable.

Also, remember that predicate caching only remembers the *last* cachable
predicate the current thread evaluated, so it is mostly useful for predicates
which will be hit more than once by a particular thread.

For example, something like:

fbt::: /self->foo/ {
	/* XXX do something */
}

fbt::: /self->foo/ {
	/* XXX do something */
}

is actually pessimal, since the predicate cache will always be out of date
(the fact that the two predicates are identical is not recognized, so each
gets assigned a separate "predcache" ID).  The logic in dtrace is:

	if (probe->predcache != DTRACE_CACHEIDNONE &&
	    probe->predcache == curthread->t_predcache) {
		/* fail immediately */
	}

	...
	evaluate predicate
	if (predicate is false) {
		if (probe->predcache != DTRACE_CACHEIDNONE) {
			curthread->t_predcache = probe->predcache;
		}
		continue;
	}

In fact, any time you have more than one enabling for a probe with a predicate,
the probe's 'predcache' will be DTRACE_CACHEIDNONE, since the two enablings
cannot (as it stands) have the same predcacheid.

There are three directions which could be explored to improve this:

	1. Extend the predicate cache to actually compare the DIFO of active
	   predicates, so that identical predicates get the same
	   predcacheid.  That would significantly improve the flexibility
	   for consumers, since you can have different action statements for
	   different probes.

	2. Increase the predcache depth (allow multiple t_predcache IDs).
	   This might not be worth it, since we check the cache on every
	   probe firing.

	3. We could investigate checking the predcache before evaluating the
	   predicate, which could help for complicated predicates.  But
	   the benefit of the predcache is really that it drops out of dtrace
	   processing as early as possible; this would have less of an effect.

#1 seems like the biggest win to me.

Cheers,
- jonathan
