Hi!

I want to see how the syscall instrumentation works at the assembly level, similar to this:

    > ufs_write::dis -n 3
    ufs_write:      save   %sp, -0x110, %sp
    ufs_write+4:    stx    %i4, [%sp + 0x8bf]
    ufs_write+8:    mov    %i0, %i5
    ufs_write+0xc:  ldx    [%i0 + 0x10], %i4

    > ufs_write::dis -n 3
    ufs_write:      ba,a   +0x19814c <0x14c95dc>
    ufs_write+4:    stx    %i4, [%sp + 0x8bf]
    ufs_write+8:    mov    %i0, %i5
    ufs_write+0xc:  ldx    [%i0 + 0x10], %i4

    > ufs_write+0x19814c::dis
    0x14c95b4:      sethi  %hi(0x1331000), %g1
    0x14c95b8:      call   +0x79ebc0e8 <dtrace_probe>
    0x14c95bc:      or     %g1, 0xc8, %o7
    0x14c95c0:      sethi  %hi(0x4000), %o0
    0x14c95c4:      or     %o0, 0x98, %o0
    0x14c95c8:      mov    0x300, %o1
    0x14c95cc:      call   +0x79ebc0d4 <dtrace_probe>
    0x14c95d0:      mov    %i0, %o2
    0x14c95d4:      ret
    0x14c95d8:      restore
    ---
    0x14c95dc:      save   %sp, -0x110, %sp
    0x14c95e0:      sethi  %hi(0x4000), %o0
    0x14c95e4:      or     %o0, 0x99, %o0
    0x14c95e8:      mov    %i0, %o1
    0x14c95ec:      mov    %i1, %o2
    0x14c95f0:      mov    %i2, %o3
    0x14c95f4:      mov    %i3, %o4
    0x14c95f8:      mov    %i4, %o5
    0x14c95fc:      sethi  %hi(0x1331400), %g1
    0x14c9600:      call   +0x79ebc0a0 <dtrace_probe>
    0x14c9604:      or     %g1, 0x8c, %o7

So, to examine this, I wrote a program which makes a system call:

    #include <unistd.h>
    int main(int argc, char *argv[]) {
        write(0,"helloworld\n",11);
        return 0;
    }

Then I started examining it with mdb:

    mdb ./syscall
    > main:b
    > :r
    mdb: stop at main
    mdb: target stopped at:
    main:           save   %sp, -0x68, %sp
    > .::dis
    main:           save   %sp, -0x68, %sp
    main+4:         st     %i0, [%fp + 0x44]
    main+8:         st     %i1, [%fp + 0x48]
    main+0xc:       sethi  %hi(0x10c00), %o1
    main+0x10:      or     %o1, 0x90, %o1
    main+0x14:      clr    %o0
    main+0x18:      call   +0x100ac <PLT:write>
    main+0x1c:      mov    0xb, %o2
    main+0x20:      clr    [%fp - 0x4]
    main+0x24:      clr    %i0
    main+0x28:      ret
    main+0x2c:      restore
    main+0x30:      clr    %i0
    main+0x34:      ret
    main+0x38:      restore

Okay, the syscall is there, and dtrace instruments it if I turn on the syscall::write:entry probe.

When I try to examine write itself, I get the same results in the instrumented and the non-instrumented case (I followed the branches, and it is the same after that too):

    > main+0x100ac::dis
    PLT:exit:       sethi  %hi(0xf000), %g1
    PLT:exit:       ba,a   -0x40 <PLT:>
    PLT:exit:       nop
    PLT:_exit:      sethi  %hi(0x12000), %g1
    PLT:_exit:      ba,a   -0x4c <PLT:>
    PLT:_exit:      nop
    PLT:write:      sethi  %hi(0x15000), %g1
    PLT:write:      ba,a   -0x58 <PLT:>
    PLT:write:      nop
    PLT:_get_exit_frame_monitor:    sethi  %hi(0x18000), %g1
    PLT:_get_exit_frame_monitor:    ba,a   -0x64 <PLT:>

I tried to ::step the program through the instrumentation, but when the probe is on, it consistently crashes at one instruction (with this, at some point I should run into dtrace_probe).

How can I see the effect of system call instrumentation at the assembly level? Maybe it would be easier if I could compile a static binary. I am using nevada build 56 on sparc.

Peter
Hi Peter,

> Hi!
>
> I want to see how the syscall instrumentation work in assembly level, so
> similar to this:

You're looking in the wrong place - you need to look at how the syscall provider works for that. DTrace instruments system calls by (basically) modifying the function pointer in the sysent table. For example, with a write(2) call, instead of calling the write() function directly via the sysent table, we first modify the original entry to call into the dtrace_systrace_syscall() function. From there we then call dtrace_probe() (which is the centre of the universe w.r.t. DTrace) and also carry out the original call itself (which is stored in the systrace_sysent array).

You can follow along by referencing these two source files:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/sysent.c
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/dtrace/systrace.c

Here's an example of how the write(2) system call is instrumented:

1) The sysent table entry before instrumentation:

    > sysent::array struct sysent 10 | ::print struct sysent
    [chop]
    {
        sy_narg = '\003'
        sy_flags = 0x2
        sy_call = 0
        sy_lock = 0
        sy_callc = write
    }

2) Execute `dtrace -n syscall::write:entry`

3) The sysent table entry when instrumented:

    [chop]
    {
        sy_narg = '\003'
        sy_flags = 0x2
        sy_call = 0
        sy_lock = 0
        sy_callc = dtrace_systrace_syscall
    }

4) Here's what the systrace_sysent entry looks like for the write() call:

    > *systrace_sysent::array struct systrace_sysent 10 | ::print -t struct systrace_sysent
    [chop]
    {
        dtrace_id_t stsy_entry = 0xb263
        dtrace_id_t stsy_return = 0
        int (*)() stsy_underlying = write
    }

The example you give is slightly confused, as it uses the fbt provider to instrument the ufs_write() entry point and then looks at a user-land application to inspect the instrumentation. If you want to look at how user-land instrumentation is done then you'd need to use the pid provider (Adam has several excellent presentations and blog entries covering that, which go into enough detail to give you a nose bleed).

If you have it, you might want to take a quick look at the DTrace introduction chapter in McDougall, Mauro and Gregg's excellent Performance, Observability and Debugging book (the second volume in the second edition of Solaris Internals). It aims to give a quick overview of how DTrace is put together. Obviously, the DTrace user guide should be read as well.

Cheers.

Jon.
Frank Hofmann
2007-Feb-22 10:17 UTC
[dtrace-discuss] Re: [mdb-discuss] examine dtrace behaviour with mdb
In addition to what Jon Haslam already said, "instrumentation" in DTrace is many things. Not all DTrace providers enable probes by putting jump or trap instructions into application/kernel code at the probe points.

The syscall provider is one that doesn't. Neither application nor kernel code is "instrumented" when you enable a syscall probe - instead, as Jon showed, the kernel's system call dispatch table is modified, with a bounce to the dtrace syscall provider in the slots that you're probing.

Syscalls in Solaris are in a function call table (sysent). An application making a system call ends up executing a trap instruction (ta 0x8 on SPARC, int/sysenter/syscall/lcall on x86 depending on CPU type) with one CPU register containing the system call number (the list of which you find in <sys/syscall.h>). The trap handler simply checks this for validity (within the range that's defined) and then gets the function pointer to call by indexing that table.

This is how syscalls get from userland into the kernel - they cause a trap, which is a privilege switching event.

Why are you not seeing those trap instructions in your app's code? Because they're in libc only. The app is not allowed to care how exactly a system call is done - it calls the libc function, via the procedure linkage table that ld.so fills in when loading/linking the app. Try the following to see this stuff:

1. Compile + Link your test program
2. load it into mdb but do not run it yet.
3. disassemble main(), find the PLT:... entries
4. put a breakpoint at main (main::bp does it in mdb)
5. run the program
6. when it hits the breakpoint, disassemble it again

You'll find that the PLT:... entries have been replaced by the actual libc function entry points. That's the linker's work.

If you disassemble those libc funcs, you'll then find the actual 'syscall' instruction (on amd64 it's indeed 'syscall').

Bye,
FrankH.
Oliver Yang
2007-Feb-23 07:09 UTC
[dtrace-discuss] Re: [mdb-discuss] examine dtrace behaviour with mdb
Frank Hofmann wrote:
> Why are you not seeing those trap instructions in your app's code ?
> Because they're in libc only. The app is not allowed to care how
> exactly a system call is done - it calls the libc function, via the
> procedure linkage table that ld.so fills in when loading/linking the
> app. Try the following to see this stuff:
>
> 1. Compile + Link your test program
> 2. load it into mdb but do not run it yet.
> 3. disassemble main(), find the PLT:... entries
> 4. put a breakpoint at main (main::bp does it in mdb)
> 5. run the program
> 6. when it hits the breakpoint, disassemble it again
>
> You'll find that the PLT:... entries have been replaced by the actual
> libc function entry points. That's the linker's work.

This won't work if you just set a breakpoint at main. Because the binding is dynamic, the runtime linker resolves the real address only when the function is called for the first time. If you really want to observe it, set a breakpoint just before "ld.so.1`elf_rtbndr" returns, then run the program and disassemble main again.

> If you disassemble those libc funcs, you'll then find the actual
> 'syscall' instruction (on amd64 it's indeed 'syscall').
>
> Bye,
> FrankH.
Peter Boros
2007-Feb-26 20:09 UTC
[dtrace-discuss] Re: [mdb-discuss] examine dtrace behaviour with mdb
Hi!

I tried this, and I see your point. What I got so far:

    -bash-3.00# mdb ./syscall
    > main:b
    > :r
    mdb: stop at main
    mdb: target stopped at:
    main:           save   %sp, -0x68, %sp
    > ld.so.1`elf_rtbndr::dis
    ld.so.1`elf_rtbndr:             mov    %i7, %o0
    ld.so.1`elf_rtbndr+4:           save   %sp, -0x60, %sp
    ld.so.1`elf_rtbndr+8:           srl    %g1, 0xa, %o1
    ld.so.1`elf_rtbndr+0xc:         ld     [%i7 + 0x8], %o0
    ld.so.1`elf_rtbndr+0x10:        call   +0x19f20 <ld.so.1`elf_bndr>
    ld.so.1`elf_rtbndr+0x14:        mov    %i0, %o2
    ld.so.1`elf_rtbndr+0x18:        mov    %o0, %g1
    ld.so.1`elf_rtbndr+0x1c:        restore
    ld.so.1`elf_rtbndr+0x20:        jmp    %g1
    ld.so.1`elf_rtbndr+0x24:        restore
    > ld.so.1`elf_rtbndr+0x24:b
    > :c
    mdb: stop at ld.so.1`elf_rtbndr+0x24
    mdb: target stopped at:
    ld.so.1`elf_rtbndr+0x24:        restore
    > main::dis
    main:           save   %sp, -0x68, %sp
    main+4:         st     %i0, [%fp + 0x44]
    main+8:         st     %i1, [%fp + 0x48]
    main+0xc:       sethi  %hi(0x10c00), %o1
    main+0x10:      or     %o1, 0x90, %o1
    main+0x14:      clr    %o0
    main+0x18:      call   +0x100ac <PLT=libc.so.1`write>
    main+0x1c:      mov    0xb, %o2
    main+0x20:      clr    [%fp - 0x4]
    main+0x24:      clr    %i0
    main+0x28:      ret
    main+0x2c:      restore
    main+0x30:      clr    %i0
    main+0x34:      ret
    main+0x38:      restore

The libc write call is indeed there now. If I disassemble it:

    > libc.so.1`write::dis -n 1
    libc.so.1`write:        save   %sp, -0x60, %sp
    libc.so.1`write+4:      ldub   [%g7 + 0xdf], %l7

With ::next dcmds I can make the probe fire, but I can't see the instrumentation. At some point I should see dtrace_systrace_syscall in main, right?

Peter
Adam Leventhal
2007-Feb-26 23:55 UTC
[dtrace-discuss] Re: [mdb-discuss] examine dtrace behaviour with mdb
> With ::next dcmds I can make the probe fire, but I can't see
> instrumentation. At some point I should see dtrace_systrace_syscall in
> main right?

No. You're examining the user-land address space of a process and the instrumentation for the syscall provider happens entirely in the kernel. You can use mdb -k or kmdb to observe that. You can observe user-land instrumentation with the pid provider or USDT providers using mdb on the process.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
Oliver Yang
2007-Feb-27 01:53 UTC
[dtrace-discuss] Re: [mdb-discuss] examine dtrace behaviour with mdb
Peter Boros wrote:> Hi! > > I tried this, and I see your point. What I got so far: > -bash-3.00# mdb ./syscall > >> main:b >> :r >> > mdb: stop at main > mdb: target stopped at: > main: save %sp, -0x68, %sp > >> ld.so.1`elf_rtbndr::dis >> > ld.so.1`elf_rtbndr: mov %i7, %o0 > ld.so.1`elf_rtbndr+4: save %sp, -0x60, %sp > ld.so.1`elf_rtbndr+8: srl %g1, 0xa, %o1 > ld.so.1`elf_rtbndr+0xc: ld [%i7 + 0x8], %o0 > ld.so.1`elf_rtbndr+0x10: call +0x19f20 > <ld.so.1`elf_bndr> > ld.so.1`elf_rtbndr+0x14: mov %i0, %o2 > ld.so.1`elf_rtbndr+0x18: mov %o0, %g1 > ld.so.1`elf_rtbndr+0x1c: restore > ld.so.1`elf_rtbndr+0x20: jmp %g1 > ld.so.1`elf_rtbndr+0x24: restore > >> ld.so.1`elf_rtbndr+0x24:b >> :c >> > mdb: stop at ld.so.1`elf_rtbndr+0x24 > mdb: target stopped at: > ld.so.1`elf_rtbndr+0x24:restore > >> main::dis >> > main: save %sp, -0x68, %sp > main+4: st %i0, [%fp + 0x44] > main+8: st %i1, [%fp + 0x48] > main+0xc: sethi %hi(0x10c00), %o1 > main+0x10: or %o1, 0x90, %o1 > main+0x14: clr %o0 > main+0x18: call +0x100ac > <PLT=libc.so.1`write> > main+0x1c: mov 0xb, %o2 > main+0x20: clr [%fp - 0x4] > main+0x24: clr %i0 > main+0x28: ret > main+0x2c: restore > main+0x30: clr %i0 > main+0x34: ret > main+0x38: restore > > There is indeed the libc system call. If I disassemble it: > >> libc.so.1`write::dis -n 1 >> > libc.so.1`write: save %sp, -0x60, %sp > libc.so.1`write+4: ldub [%g7 + 0xdf], %l7 > > With ::next dcmds I can make the probe fire, but I can''t see > instrumentation. At some point I should see dtrace_systrace_syscall in > main right? > > >This just libc wrapper for write, it''s in user land. If you disassemble the whole function of libc.so.1`write, you might find it will issue a trap for system call as Frank Hofmann mentioned in previous mail. If you want check the with kernel land, you have to use mdb -k. And if you want set breakpoint in kernel, you should use kmdb. 
When you enter kmdb with "mdb -K", please make sure you run the command on the console rather than in an X window session.

> On Fri, 2007-02-23 at 15:09 +0800, Oliver Yang wrote:
>> Frank Hofmann wrote:
>>> Why are you not seeing those trap instructions in your app's code?
>>> Because they're in libc only. The app is not allowed to care how
>>> exactly a system call is done - it calls the libc function, via the
>>> procedure linkage table that ld.so fills in when loading/linking the
>>> app. Try the following to see this stuff:
>>>
>>> 1. Compile + Link your test program
>>> 2. load it into mdb but do not run it yet.
>>> 3. disassemble main(), find the PLT:... entries
>>> 4. put a breakpoint at main (main::bp does it in mdb)
>>> 5. run the program
>>> 6. when it hits the breakpoint, disassemble it again
>>>
>>> You'll find that the PLT:... entries have been replaced by the actual
>>> libc function entry points. That's the linker's work.
>>
>> This shouldn't work if you just set a breakpoint at main. Because it's a
>> dynamic binding, the runtime linker gets the real address when the
>> function is called for the first time. If you really want to observe
>> it, you should set a breakpoint before "ld.so.1`elf_rtbndr" returns,
>> then run it, and disassemble main again.
>>
>>> If you disassemble those libc funcs, you'll then find the actual
>>> 'syscall' instruction (on amd64 it's indeed 'syscall').
>>>
>>> Bye,
>>> FrankH.