Sean McGrath - Sun Microsystems Ireland
2005-Aug-31 14:28 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Ho folks. Got a weird thing going on here with some simply dtrace scripts. I''ve got two c-programs, execv_it and spawned. execv_it just fork()s and execv''s the spawned binary. The spawned binary just calls some maths functions. Its a test case elvoved from trying to things in gnome... (specifically gnome-session forking and execve''ng gconf-sanity-check-2) runs like: $ ./execv_it spawned All programs, dtrace scripts in the attached tar ball. What I''d like is to dtrace the spawned process to collect data on all libraries it uses (in this case libc and libm). I''m using two dtrace scripts to get this. The first, d1.d, catches the process on proc:::start, stops the process and gets the second dtrace script, d2.d, to dtrace the spawned process. Theres a few strange things going on.. (happens on s10, snv_XY, sparc and x86...) First: Sometimes it works, sometimes it doesn''t. The d1.d script catches the execv_it process and can trace as far as the execve() call. But sometimes thats where it stops and doesn''t continue. If I stop the d1.d script and start again it may go on and dtrace the spawned process fine. output: ... <- execv_it LM1`ld.so.1 leave <- execv_it LM1`ld.so.1 elf_bndr <- execv_it LM1`ld.so.1 elf_rtbndr -> execv_it libc.so.1 execve ** sometimes it never gets past here ** <- spawned LM1`ld.so.1 rtld_db_dlactivity <- spawned LM1`ld.so.1 rd_event ... Secondly: Sometimes both the spawned process and dtrace itself core dump.. cores, pstacks available on request :) Though heres the two dtrace scripts. I''m using proc:::start since using syscall::fork*:entry I won''t see the libm calls etc. Thirdly: John Rice had a go with these already (hes trying to dtrace fully the gconf-sanity-check-2 process). He tells me that changing the printf() calls in the execv_it.c can induce further weirdness; where d2.d script fails to prun the spawned process or gets no output. Thoughts ? Any glaring mistakes I''ve made :) Regards, Sean. . === d1.d #!/usr/sbin/dtrace -s #pragma D option destructive #pragma D option quiet BEGIN { printf("d1: starting d1\n"); } proc:::start /execname == $$1/ { stop(); printf("d1: stopped %d and running d2.d\n", pid); system("(sleep 2; /uasr/sbin/dtrace -o /tmp/d.%d -s d2.d -p %d)&", pid, pid); /* exit(0); */ } === d2.d #!/usr/sbin/dtrace -s #pragma D option destructive #pragma D option quiet #pragma D option flowindent BEGIN { printf("d2: prunning %d\n", $target); system("prun %d", $target); } pid$target:::entry { printf("-> %s %-12s %-12s\n", execname, probemod, probefunc); } pid$target:::return { printf("<- %s %-12s %-12s\n", execname, probemod, probefunc); } ==-------------- next part -------------- A non-text attachment was scrubbed... Name: dtrace.weird.tar.gz Type: application/x-tar-gz Size: 1051 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20050831/d56fd835/attachment.tgz>
Jonathan Adams
2005-Sep-01 00:01 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
On Wed, Aug 31, 2005 at 03:28:06PM +0100, Sean McGrath - Sun Microsystems Ireland wrote:> > Ho folks. > > Got a weird thing going on here with some simply dtrace scripts. > > I''ve got two c-programs, execv_it and spawned. execv_it just fork()s and > execv''s the spawned binary. The spawned binary just calls some maths > functions. Its a test case elvoved from trying to things in gnome... > (specifically gnome-session forking and execve''ng gconf-sanity-check-2)What version of Solaris is this? 10? Nevada? Cheers, - jonathan -- Jonathan Adams, Solaris Kernel Development
Sean McGrath - Sun Microsystems Ireland
2005-Sep-01 08:29 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Jonathan Adams stated: < On Wed, Aug 31, 2005 at 03:28:06PM +0100, Sean McGrath - Sun Microsystems Ireland wrote: < > < > Ho folks. < > < > Got a weird thing going on here with some simply dtrace scripts. < > < > I''ve got two c-programs, execv_it and spawned. execv_it just fork()s and < > execv''s the spawned binary. The spawned binary just calls some maths < > functions. Its a test case elvoved from trying to things in gnome... < > (specifically gnome-session forking and execve''ng gconf-sanity-check-2) < < What version of Solaris is this? 10? Nevada? s10 and Nevada, both sparc and x86. Regards, < < Cheers, < - jonathan < < -- < Jonathan Adams, Solaris Kernel Development -- Sean. . sean DoT mcgrath At sun.com http://blogs.sun.com/smg http://www.sun.com/software/solaris
sivakumar
2005-Sep-01 10:55 UTC
[dtrace-discuss] Re: fork and execve dtrace behaviour weirdness...
After forking try to introduce a sleep for couple of seconds and then try to run it. Now it will be able to catch all the forked process. Regards, Siva Kumar.K This message posted from opensolaris.org
Sean McGrath - Sun Microsystems Ireland
2005-Sep-01 14:49 UTC
[dtrace-discuss] Re: fork and execve dtrace behaviour weirdness...
Tried that; 2, 5, 10, 15 seconds. All to no avail :( Catching a fork[1]() is fine, can do no problem. Its the execve() that the process does later and I want to dtrace that execv''d process. Using proc:::start was the closest I could get. Regards, Sean. . sivakumar stated: < After forking try to introduce a sleep for couple of seconds and then try to run it. < Now it will be able to catch all the forked process. < < Regards, < Siva Kumar.K < This message posted from opensolaris.org < _______________________________________________ < dtrace-discuss mailing list < dtrace-discuss at opensolaris.org
Wee Yeh Tan
2005-Sep-01 15:12 UTC
[dtrace-discuss] Re: fork and execve dtrace behaviour weirdness...
On 9/1/05, Sean McGrath - Sun Microsystems Ireland <sean.mcgrath at sun.com> wrote:> > Tried that; 2, 5, 10, 15 seconds. All to no avail :( > > Catching a fork[1]() is fine, can do no problem. Its the execve() that the > process does later and I want to dtrace that execv''d process. Using > proc:::start was the closest I could get.I am just guessing that the sleep in d1.d really doesn''t matter because the process is stopped. You likely caught the child before it does the execve(). You might want to try proc:::exec-success in place of proc:::start. -- Just me, Wire ...
Sean McGrath - Sun Microsystems Ireland
2005-Sep-01 15:29 UTC
[dtrace-discuss] Re: fork and execve dtrace behaviour weirdness...
Wee Yeh Tan stated: < On 9/1/05, Sean McGrath - Sun Microsystems Ireland <sean.mcgrath at sun.com> wrote: < > < > Tried that; 2, 5, 10, 15 seconds. All to no avail :( < > < > Catching a fork[1]() is fine, can do no problem. Its the execve() that the < > process does later and I want to dtrace that execv''d process. Using < > proc:::start was the closest I could get. < < I am just guessing that the sleep in d1.d really doesn''t matter < because the process is stopped. You likely caught the child before it < does the execve(). < < You might want to try proc:::exec-success in place of proc:::start. Tried that too :) Doesn''t catch it at the right place to allow tracing of any libraries that the process may load up (I think). proc:::start was the only probe to let me trace the exec''d process. < < < -- < Just me, < Wire ... -- Sean. . sean DoT mcgrath At sun.com http://blogs.sun.com/smg http://www.sun.com/software/solaris
Adam Leventhal
2005-Sep-08 18:20 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Hi Sean,> The d1.d script catches the execv_it process and can trace as far as > the execve() call. But sometimes thats where it stops and doesn''t continue.Tracing will necessarily stop at the exec since all the probes that you had matched prior to the exec are invalidated. Are you trying to trace up to the exec or after the exec? For the former, continue to use proc:::start, for the latter, you may want to use proc:::exec-success.> Secondly: > Sometimes both the spawned process and dtrace itself core dump.. > cores, pstacks available on request :)Yikes! Please send the cores to me in private mail (not that I don''t want other to see them -- I just figure not everyone wants them polluting their mail boxes).> Though heres the two dtrace scripts. I''m using proc:::start since using > syscall::fork*:entry I won''t see the libm calls etc.What''s that about fork*? I don''t understand?> Thirdly: > John Rice had a go with these already (hes trying to dtrace fully the > gconf-sanity-check-2 process). He tells me that changing the printf() > calls in the execv_it.c can induce further weirdness; where d2.d script > fails to prun the spawned process or gets no output.Can you provide more details? Also, can you provide more details about what you''re trying to do precisely? Do you want to see all activity in the spawned process before or after the exec? If you''re using flowindent you don''t need to explicitly printf those arrows (but I suppose it won''t hurt). Adam -- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
Sean McGrath - Sun Microsystems Ireland
2005-Sep-12 11:14 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Adam Leventhal stated: < Hi Sean, < < > The d1.d script catches the execv_it process and can trace as far as < > the execve() call. But sometimes thats where it stops and doesn''t continue. < < Tracing will necessarily stop at the exec since all the probes that you < had matched prior to the exec are invalidated. Are you trying to trace up < to the exec or after the exec? For the former, continue to use proc:::start, < for the latter, you may want to use proc:::exec-success. Will want to trace after the exec. I''ve tried using proc:::exec-success, though it doesn''t see any library calls that the process may use, eg from libm or libgtk. < < > Secondly: < > Sometimes both the spawned process and dtrace itself core dump.. < > cores, pstacks available on request :) < < Yikes! Please send the cores to me in private mail (not that I don''t want < other to see them -- I just figure not everyone wants them polluting their < mail boxes). Wilco. The core dumps are strange, they don''t happen all the time. < < > Though heres the two dtrace scripts. I''m using proc:::start since using < > syscall::fork*:entry I won''t see the libm calls etc. < < What''s that about fork*? I don''t understand? < < > Thirdly: < > John Rice had a go with these already (hes trying to dtrace fully the < > gconf-sanity-check-2 process). He tells me that changing the printf() < > calls in the execv_it.c can induce further weirdness; where d2.d script < > fails to prun the spawned process or gets no output. < < Can you provide more details? < < Also, can you provide more details about what you''re trying to do precisely? < Do you want to see all activity in the spawned process before or after the < exec? If you''re using flowindent you don''t need to explicitly printf those < arrows (but I suppose it won''t hurt). (the arrows were left overs from an earlier version) Details: Logging in to a gnome (aka JDS) session on a desktop or sunray. The first main process to start off the gnome session is gnome-session. This process then forks off /usr/lib/gconf-sanity-check-2 gnome-session +-> forks itself +-> execv /usr/lib/gconf-sanity-check-2 gnome-session then goes and forks off some more processes. Its this /usr/lib/gconf-sanity-check-2 process we want to fully trace. Its causing a slowdown in login times for sunrays. As yet we''re unable to fully trace it with the pid probe. We can only catch the process''s own functions and from ld.so, but nothing from libglib or any other library gconf-sanity-check-2 loads up. Under snv_2X I''ve been able to fully trace the gconf-sanity-check-2, but John wasn''t able to do so on the sunray servers, which run S10 (patched). John Rice''s blog has some more detail, he makes use of the bootchart dtrace tool from Eric and Dan as well. http://blogs.sun.com/roller/page/jmr/20050908 < < Adam < < -- < Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl -- Sean. . sean DoT mcgrath At sun.com http://blogs.sun.com/smg http://www.sun.com/software/solaris
Adam Leventhal
2005-Sep-12 18:36 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
On Mon, Sep 12, 2005 at 12:14:11PM +0100, Sean McGrath - Sun Microsystems Ireland wrote:> < Yikes! Please send the cores to me in private mail (not that I don''t want > < other to see them -- I just figure not everyone wants them polluting their > < mail boxes). > > Wilco. The core dumps are strange, they don''t happen all the time.You hit a bug in libdtrace which was introduced as part of the work to allow for user-level probe descriptions to be reevaluated when a shared object is opened. I''ve filed this bug: 6322760 dt_pid.c needs to rework its use of xyerror() This bug caused the dtrace command to crash which then caused the traced process to crash because dtrace left some /proc breakpoints behind.> Details: > Logging in to a gnome (aka JDS) session on a desktop or sunray. > The first main process to start off the gnome session is gnome-session. > This process then forks off /usr/lib/gconf-sanity-check-2 > > gnome-session > +-> forks itself > +-> execv /usr/lib/gconf-sanity-check-2 > > gnome-session then goes and forks off some more processes. > > Its this /usr/lib/gconf-sanity-check-2 process we want to fully trace. Its > causing a slowdown in login times for sunrays. As yet we''re unable to > fully trace it with the pid probe. We can only catch the process''s own > functions and from ld.so, but nothing from libglib or any other library > gconf-sanity-check-2 loads up.On Solaris 10, if you stop a traced process immediately after it execs, all you''ll be able to trace is the a.out binary and the linker (as you mentioned). This is because (at exec time) those are the only objects mapped into the process''s address space. By the time you get to main, all libraries should be mapped and will be visible through DTrace.> Under snv_2X I''ve been able to fully trace the gconf-sanity-check-2, > but John wasn''t able to do so on the sunray servers, which run S10 > (patched).In Solaris Nevada, we improved the dtrace command so that it would detect when new objects were loaded into the process and instrument them as well. That''s why you see the right thing on snv_2X but not on S10. Those features will be available soon as a patch to S10.> John Rice''s blog has some more detail, he makes use of the bootchart dtrace > tool from Eric and Dan as well. > > http://blogs.sun.com/roller/page/jmr/20050908That''s cool stuff! Adam -- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
John Rice
2005-Sep-13 15:22 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Thanks for the info Alan and Sean: >On Solaris 10, if you stop a traced process immediately after it execs, >all you''ll be able to trace is the a.out binary and the linker (as you >mentioned). This is because (at exec time) those are the only objects >mapped into the process''s address space. By the time you get to main, >all libraries should be mapped and will be visible through DTrace. //We''ve tried instrumenting on entry to main() for gnome-session on Sol10 GA, but with limited success: 1. Trigger.d dtrace script - catches exec of gnome-session (or whatever exe we are interested in), then fires off a launch.d script with the PID of gnome-session. 2. Launch.d dtrace script - pid probe on a.out:main:entry, catches entry to gnome-session main() and then fires off the real trace.d script, again passing the PID of gnome-session. 3. Trace.d script - now using pid probes on a.out and also a general pid probe set to fire conditionally when we hit a function of interest, so we can follow all the library calls at this point. Doing the above dtrace will complain about instrumenting a library call - we can''t instrument on various library calls invoked by main() or one of the other gnome-session functions :( However, we can get more informative ustack() traces for various syscalls, triggered when the function of interest has been entered [used to just get hex output, now see most of the stack]. >In Solaris Nevada, we improved the dtrace command so that it would detect >when new objects were loaded into the process and instrument them as well. Great :) .... But unfortunately I had wanted to look at this on live production servers, so a patch that I could put up on some test servers would be the business. If you can send me an update on when the patch would be available that would be fantastic. Thanks, JR -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20050913/03545b2c/attachment.html>
Manoj Joseph
2005-Sep-13 15:53 UTC
[dtrace-discuss] fork and execve dtrace behaviour weirdness...
Hi John, John Rice wrote:> //We''ve tried instrumenting on entry to main() for gnome-session on > Sol10 GA, but with limited success: > > 1. Trigger.d dtrace script - catches exec of gnome-session (or whatever > exe we are interested in), then fires off a launch.d script with the PID > of gnome-session. > 2. Launch.d dtrace script - pid probe on a.out:main:entry, catches entry > to gnome-session main() and then fires off the real trace.d script, > again passing the PID of gnome-session. > 3. Trace.d script - now using pid probes on a.out and also a general pid > probe set to fire conditionally when we hit a function of interest, so > we can follow all the library calls at this point. > > Doing the above dtrace will complain about instrumenting a library call > - we can''t instrument on various library calls invoked by main() or one > of the other gnome-session functions :( However, we can get moreWhat you probably need to do is catch the loading of the library. Please see the following three scripts that does this libc. http://mowgli.india.sun.com/documents/dtrace/scripts/watch.d http://mowgli.india.sun.com/documents/dtrace/scripts/monitor.d http://mowgli.india.sun.com/documents/dtrace/scripts/monitor2.d The block comment at the top of the script mentions how it is to be run. Please let me know if something is not clear. Hope that helps. Cheers Manoj