Roch Bourbonnais - Performance Engineering
2005-Sep-12 13:18 UTC
[dtrace-discuss] profile and fbt interaction
I am trying to profile within a given function myfunc So I use fbt entry and return points of myfunc to mark when a cpu is running it. From experience I know it does not usually migrate CPU during execution. myfunc:entry{traced[cpu] = 1;} myfunc:return{traced[cpu] = 0;} I then use the profile provider and only aggregate when I hit a traced CPU. profile-10ms/traced[cpu] != 0/ {@a[stack(20)] = count();} The aggregation had stacks I could not believe such as: dtrace`dtrace_probe+0xcf8 bogus+0x134c myfunc+0x404 ..... A disassembly of bogus showed that it''s last instruction is just before (in kernel test) the first instruction of myfunc. bogus+0x134c: sethi %hi(0), %i4 myfunc: ba,a -0x270f08 <dt=0x7e22> myfunc+4: sethi %hi(0x701d7000), %i5 So it looks that when the profile provider hits the first instruction of myfunc, and that instruction is instrumented, it reports a hit on ''myfunc-0x4''. Anyone seen this before ? -roch
On Mon, Sep 12, 2005 at 03:18:06PM +0200, Roch Bourbonnais - Performance Engineering wrote:> A disassembly of bogus showed that it''s last instruction is just before (in > kernel test) the first instruction of myfunc. > > bogus+0x134c: sethi %hi(0), %i4 > myfunc: ba,a -0x270f08 <dt=0x7e22> > myfunc+4: sethi %hi(0x701d7000), %i5 > > > So it looks that when the profile provider hits the first instruction of myfunc, and > that instruction is instrumented, it reports a hit on ''myfunc-0x4''.This is due to the way the instrumentation code tailcalls dtrace_probe(). To illustrate, I''ve enabled fbt::kmem_slab_alloc:entry. Here''s the disassembly:> kmem_slab_alloc::dis ! head -1kmem_slab_alloc: ba,a +0x2ea684 <dt=0x1af9>> kmem_slab_alloc+0x2ea684::dis... 0x143ffa8: save %sp, -0xb0, %sp 0x143ffac: sethi %hi(0x1800), %o0 0x143ffb0: or %o0, 0x2f9, %o0 0x143ffb4: mov %i0, %o1 0x143ffb8: mov %i1, %o2 0x143ffbc: sethi %hi(0x1155800), %g1 0x143ffc0: call +0x7a32f0c4 <dtrace_probe> 0x143ffc4: or %g1, 0x120, %o7 ... If you look at the stuff around the call, we set %o7 to 0x1155920. That''s:> 0x1155800+120=akmem_slab_destroy+0x74:> 0x1155800+120+4=akmem_slab_alloc: Presumably, the profile probe is firing after dtrace_probe() restores interrupts, but before it returns to %o7+8, so you see the bogus frame. Cheers, - jonathan -- Jonathan Adams, Solaris Kernel Development