thr3ads.net - dtrace discuss - [dtrace-discuss] USDT probe performance question [Dec 2006]

James McIlree

2006-Dec-21 10:47 UTC

[dtrace-discuss] USDT probe performance question

I have been tracing through the code for is_enabled()
USDT probes.

	From what I can see, an is_enabled() probe is initially
a "fake" call site.

	During dtrace -G processing, a 5 byte call instruction
is replaced with an xor followed by nops:

	/*
	 * Establish the instruction sequence -- all nops for probes, and an
	 * instruction to clear the return value register (%eax/%rax) followed
	 * by nops for is-enabled probes. For is-enabled probes, we advance
	 * the offset to the first nop. This isn't strictly necessary but makes
	 * for more readable disassembly when the probe is enabled.
	 */

	if (!isenabled) {
		ip[0] = DT_OP_NOP;
		ip[1] = DT_OP_NOP;
		ip[2] = DT_OP_NOP;
		ip[3] = DT_OP_NOP;
		ip[4] = DT_OP_NOP;
	} else if (dtp->dt_oflags & DTRACE_O_LP64) {
		ip[0] = DT_OP_REX_RAX;
		ip[1] = DT_OP_XOR_EAX_0;
		ip[2] = DT_OP_XOR_EAX_1;
		ip[3] = DT_OP_NOP;
		ip[4] = DT_OP_NOP;
		(*off) += 3;
	} else {
		ip[0] = DT_OP_XOR_EAX_0;
		ip[1] = DT_OP_XOR_EAX_1;
		ip[2] = DT_OP_NOP;
		ip[3] = DT_OP_NOP;
		ip[4] = DT_OP_NOP;
		(*off) += 2;
	}

	So they effectively become:

	xor eax, eax
	nop
	nop
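
	As a minimal sketch of the byte layout the excerpt above produces,
the helper below fills a 5-byte site for each of the three cases. The
opcode values are the standard x86 encodings (0x90 for nop, 0x48 for the
REX.W prefix, 33 c0 for xor %eax, %eax) and are assumed to match the
DT_OP_* macros; fill_site() and SITE_LEN are hypothetical names, not
part of libdtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard x86 encodings, assumed to match the DT_OP_* macros. */
#define DT_OP_NOP        0x90  /* nop */
#define DT_OP_REX_RAX    0x48  /* REX.W prefix: 64-bit operand size */
#define DT_OP_XOR_EAX_0  0x33  /* xor r32, r/m32 opcode */
#define DT_OP_XOR_EAX_1  0xc0  /* ModRM byte selecting %eax, %eax */

#define SITE_LEN 5             /* size of the call instruction replaced */

/*
 * Hypothetical helper: fill a 5-byte call site the way the excerpt does
 * and return how far the probe offset advances, which is the index of
 * the first nop for is-enabled probes and 0 for regular probes.
 */
static size_t
fill_site(uint8_t ip[SITE_LEN], int isenabled, int lp64)
{
	size_t i = 0;
	size_t advance;

	assert(ip != NULL);
	if (isenabled) {
		if (lp64)
			ip[i++] = DT_OP_REX_RAX;
		ip[i++] = DT_OP_XOR_EAX_0;
		ip[i++] = DT_OP_XOR_EAX_1;
	}
	advance = i;
	while (i < SITE_LEN)
		ip[i++] = DT_OP_NOP;	/* pad the rest of the site */
	return (advance);
}
```

	Under these assumptions the 32-bit is-enabled site is 33 c0 90 90 90
(an xor and three nops, offset advanced by 2) and the 64-bit site is
48 33 c0 90 90 (a REX-prefixed xor and two nops, offset advanced by 3).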

	Now, when you enable an "is enabled" probe, you write the regular
probe style trap into the first nop in the above sequence.

	Then, in the kernel, you overwrite the return register to have a value
of 1, return control flow to the program, and things proceed with altered
control flow.
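
	The current mechanism can be modelled with a toy interpreter. This is
only a sketch: it assumes the 32-bit site layout, assumes the trap byte
is int3 (0xcc), and stands in for the kernel with a counter. run_site()
and the OP_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_NOP   0x90
#define OP_XOR0  0x33
#define OP_XOR1  0xc0
#define OP_TRAP  0xcc	/* int3; assumed to be the byte written on enable */

/*
 * Toy interpreter for a 32-bit is-enabled site. Returns the final value
 * of %eax and counts how many times the site trapped.
 */
static uint32_t
run_site(const uint8_t site[5], unsigned *traps)
{
	uint32_t eax = 0xdeadbeef;	/* garbage until the site clears it */
	size_t pc = 0;

	assert(site != NULL && traps != NULL);
	while (pc < 5) {
		if (pc + 1 < 5 && site[pc] == OP_XOR0 &&
		    site[pc + 1] == OP_XOR1) {
			eax = 0;	/* xor %eax, %eax */
			pc += 2;
		} else if (site[pc] == OP_TRAP) {
			(*traps)++;	/* round trip into the "kernel" */
			eax = 1;	/* handler sets the return register */
			pc += 1;
		} else {
			pc += 1;	/* nop */
		}
	}
	return (eax);
}
```

	With the site 33 c0 90 90 90 this returns 0 without trapping; after
writing 0xcc over the first nop (offset 2) it returns 1 at the cost of
one trap per execution, which is the round trip in question.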

	However, it looks like you could do the following:

	Replace the 5 byte call instruction with a 5 byte movl 0, eax.

	This encodes as:

	b8 00 00 00 00

	Now, if you set the "offset" for this instruction just past the b8,
whenever you "enable" that probe, it will write a trap over the immediate
data. The instruction will become

	movl non-zero-value, eax

	This should cause the enabled probe logic to switch "on", without a
trap into the kernel.
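
	A sketch of the proposed patch, assuming the trap byte is int3 (0xcc)
and that enabling is a single-byte store at offset 1. mov_imm() and
set_enabled() are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_MOV_IMM32_EAX 0xb8	/* movl $imm32, %eax */
#define OP_TRAP          0xcc	/* assumed trap byte (int3) */

/* Decode the little-endian imm32 of a "b8 xx xx xx xx" instruction. */
static uint32_t
mov_imm(const uint8_t insn[5])
{
	assert(insn != NULL && insn[0] == OP_MOV_IMM32_EAX);
	return ((uint32_t)insn[1] | (uint32_t)insn[2] << 8 |
	    (uint32_t)insn[3] << 16 | (uint32_t)insn[4] << 24);
}

/* Hypothetical enable/disable: one single-byte store at offset 1. */
static void
set_enabled(uint8_t insn[5], int on)
{
	assert(insn != NULL);
	insn[1] = on ? OP_TRAP : 0x00;
}
```

	Starting from b8 00 00 00 00, enabling yields b8 cc 00 00 00, i.e.
movl $0xcc, %eax, so the caller sees a non-zero value without the trap
byte ever being executed as an instruction.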

	This looks to me like it should work: it is atomic, it alters control
flow, you can still use the existing probe install code (with a very small
change for IS_ENABLED probes), and it cuts out a round trip to the kernel.

	Does this seem like a reasonable change?

	James M

P.S. One more question :-). Is there a reason you are clearing the rax
register in the 64 bit case? The 64 bit is_enabled function prototype is
still for an int, correct?

Adam Leventhal

2006-Dec-21 16:50 UTC

To summarize James' question: why does an is-enabled probe generate an
'xor %eax, %eax' instruction rather than a 'movl $0, %eax', given that
enabling the probe in the first case requires a trap whereas the second
case would require only a change to the immediate value?

There are two answers to this. The less compelling of the two relates
to your second question of why we clear %rax rather than just %eax. We
do this because our experience with compilers hasn't been altogether
positive: compilers can use their "knowledge" of "constraints" to
perform certain "optimizations". The fear was that some compiler might
decide to perform comparisons on %rax rather than %eax. On the amd64 ISA,
we can't clear an 8-byte register with a 4-byte immediate value so we
need to use the 'xor %rax, %rax'. We didn't want to create a completely
bifurcated mechanism for 32- and 64-bit ISAs.

The much better reason is that stuffing a '1' into the immediate value of
a 'movl' would create a gigantic headache for the pid provider. Note that
a user could trace that 'movl' instruction with a pid provider offset
probe. As we'd now have two different mechanisms for instrumenting the
exact same text, we'd need to be constantly checking for overlapping
probes -- adjusting the notion of the original instruction if a pid
provider probe was enabled after the is-enabled probe, or modifying a
live tracepoint if the is-enabled probe was enabled second. By having
both use the same tracing mechanism I was able to introduce the
is-enabled probe with relatively minor changes to the tracepoint code in
the pid provider. Using two alternate methods of instrumentation doesn't
present an unsolvable problem, but it's very complicated and the only
benefit would be an improvement for the _enabled_ probe effect -- much
much less important than the _disabled_ probe effect.
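
The overlap problem can be illustrated with a small sketch. It assumes,
for illustration only, that a pid provider tracepoint keeps a copy of the
instruction it displaced and that the trap byte is int3 (0xcc); the
struct and function names are hypothetical and are not the fasttrap
implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSN_LEN 5
#define OP_TRAP  0xcc	/* assumed trap byte (int3) */

/* Hypothetical tracepoint: remembers the instruction it displaced. */
struct tracepoint {
	uint8_t saved[INSN_LEN];
};

/* Install a trap-based tracepoint at the start of the instruction. */
static void
pid_enable(uint8_t text[INSN_LEN], struct tracepoint *tp)
{
	assert(text != NULL && tp != NULL);
	memcpy(tp->saved, text, INSN_LEN);	/* "original instruction" */
	text[0] = OP_TRAP;
}

/* The second mechanism: patch the immediate directly in the text. */
static void
isenabled_enable_naive(uint8_t text[INSN_LEN])
{
	assert(text != NULL);
	text[1] = 0x01;
}

/* Low immediate byte the tracepoint would use when emulating the movl. */
static uint8_t
emulated_result(const struct tracepoint *tp)
{
	assert(tp != NULL);
	return (tp->saved[1]);
}
```

If pid_enable() runs first on b8 00 00 00 00 and isenabled_enable_naive()
runs second, the live text holds the new immediate but the saved copy
still holds 0, so the tracepoint would emulate a disabled probe unless it
is modified too. In the opposite order the saved copy holds 1 and would
have to be adjusted when the is-enabled probe is later disabled. Both
cases are the bookkeeping described above.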

If anyone's interested in more information on the implementation of the
pid provider, you might check out my blog post on the subject:

  http://blogs.sun.com/ahl/entry/pid_provider_exposed

Note that it's rather dense, and ultimately won't make you a better user
of DTrace, but if anyone makes it to the end I'll consider updating it to
include information on the relatively new is-enabled probes.

Adam

-- 
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
