Hi all,

I am trying to write a D script which would print ustack() for every
program in the system receiving SIGSEGV. All the stacks printed in
trap()/sigtoproc() context do not have meaningful symbols.

The following solves the problem to some degree but I'd much rather have
a self-contained D script.

dtrace -w -n 'fbt:genunix:sigtoproc:entry/arg2 == 11/ {
self->pid=((proc_t *)arg0)->p_pidp->pid_id; stop();
system("/usr/bin/gcore %d", self->pid); system("/usr/bin/prun %d",
self->pid); }'

Any ideas (or code) will be appreciated,


v.

On Wed, Oct 01, 2008 at 02:18:55PM +0200, Vladimir Kotal wrote:
>
> Hi all,
>
> I am trying to write a D script which would print ustack() for every
> program in the system receiving SIGSEGV. All the stacks printed in
> trap()/sigtoproc() context do not have meaningful symbols.
>
> The following solves the problem to some degree but I'd much rather have
> a self-contained D script.
>
> dtrace -w -n 'fbt:genunix:sigtoproc:entry/arg2 == 11/ {
> self->pid=((proc_t *)arg0)->p_pidp->pid_id; stop();
> system("/usr/bin/gcore %d", self->pid); system("/usr/bin/prun %d",
> self->pid); }'

This is stopping the signal sender, not the signal receiver.

#!/usr/sbin/dtrace -s

#pragma D option destructive
#pragma D option quiet

proc:::signal-send
/args[2] == SIGSEGV/
{
        segv_sent[args[1]->pr_addr] = 1;
}

fbt::issig_forreal:entry
/segv_sent[(uintptr_t)curthread->t_procp]/
{
        segv_sent[(uintptr_t)curthread->t_procp] = 0;
        printf("%6d %s\n", pid, curpsinfo->pr_psargs);
        ustack(20);
        stop();
        system("/usr/bin/prun %d", pid);
}

This should work regardless of the source of the segv. (the main trick is
calling stop() at the top of issig_forreal(); that will stop the process
before the SEGV is processed, letting dtrace get a stack trace from it.)

Cheers,
- jonathan

> Any ideas (or code) will be appreciated,
>
>
> v.

On Wed, Oct 1, 2008 at 5:48 PM, Vladimir Kotal <Vladimir.Kotal at sun.com> wrote:
>
> Hi all,
>
> I am trying to write a D script which would print ustack() for every
> program in the system receiving SIGSEGV. All the stacks printed in
> trap()/sigtoproc() context do not have meaningful symbols.
>
> The following solves the problem to some degree but I'd much rather have
> a self-contained D script.
>

The appcrash utility works beautifully and is for a similar purpose.
The script is not self-contained though:

http://blogs.sun.com/gregns/entry/making_system_wide_appcrash_to
http://developers.sun.com/solaris/articles/app_crash/app_crash.html

-Shiv

S h i v wrote:

<snip>

> The appcrash utility works beautifully and is for a similar purpose.
> The script is not self-contained though:
>
> http://blogs.sun.com/gregns/entry/making_system_wide_appcrash_to
> http://developers.sun.com/solaris/articles/app_crash/app_crash.html

I have not heard about appcrash until now, looks very useful.

It would be interesting to see how dtrace script can be "daemonized" via
SMF and how it behaves in SMF environment (in particular in terms of
service states).

Also, having the service to report top N crash ustacks periodically (to
a log file e.g.) would be useful (using aggregations).


v.

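[Editorial sketch] The "top N crash ustacks" idea above could look roughly like the D script below. It is not taken from appcrash or from any message in this thread; the aggregation name @crashes, the 60-second interval, the stack depth of 10 and the limit of 5 entries are all arbitrary assumptions. The predicate reuses the `pid == args[1]->pr_pid` condition that appcrash is reported to use, so only self-inflicted SIGSEGVs are counted and ustack() belongs to the faulting process.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Count SIGSEGVs that a process raises against itself (for example a
 * hardware trap), keyed by executable name and user stack.
 * Assumption: depth 10 is enough to tell crash sites apart.
 */
proc:::signal-send
/args[2] == SIGSEGV && pid == args[1]->pr_pid/
{
        @crashes[execname, ustack(10)] = count();
}

/*
 * Once a minute, keep only the 5 most frequent entries and print them.
 * Counts are cumulative; trunc() discards everything outside the top 5.
 */
tick-60s
{
        trunc(@crashes, 5);
        printf("--- %Y: top SIGSEGV stacks ---\n", walltimestamp);
        printa(@crashes);
}
```

One caveat with this approach: user stacks are turned into symbols only when the aggregation is printed, so frames belonging to processes that have already exited will most likely show up as raw addresses. Redirecting the output to a file (dtrace -o) would give the periodic log mentioned above.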
Jonathan Adams wrote:
> On Wed, Oct 01, 2008 at 02:18:55PM +0200, Vladimir Kotal wrote:

<snip>

>> dtrace -w -n 'fbt:genunix:sigtoproc:entry/arg2 == 11/ {
>> self->pid=((proc_t *)arg0)->p_pidp->pid_id; stop();
>> system("/usr/bin/gcore %d", self->pid); system("/usr/bin/prun %d",
>> self->pid); }'
>
> This is stopping the signal sender, not the signal receiver.

I see. In my scenario it worked because it was the case of HW caused
trap so ttoproc(curthread) was equal to first argument of sigtoproc().

BTW appcrash ensures this via 'pid == args[1]->pr_pid' condition in the
predicate used for proc:::signal-send.

<snip>

>         stop();
>         system("/usr/bin/prun %d", pid);
> }

Maybe stupid/ignorant question but I'll ask anyway: why there is no
start()/run() in dtrace ?

> This should work regardless of the source of the segv. (the main trick is
> calling stop() at the top of issig_forreal(); that will stop the process
> before the SEGV is processed, letting dtrace get a stack trace from it.)

The script works fine, thanks a lot for it.


v.

On Thu, Oct 02, 2008 at 12:31:30PM +0200, Vladimir Kotal wrote:
> Jonathan Adams wrote:
> >On Wed, Oct 01, 2008 at 02:18:55PM +0200, Vladimir Kotal wrote:
>
> <snip>
>
> >>dtrace -w -n 'fbt:genunix:sigtoproc:entry/arg2 == 11/ {
> >>self->pid=((proc_t *)arg0)->p_pidp->pid_id; stop();
> >>system("/usr/bin/gcore %d", self->pid); system("/usr/bin/prun %d",
> >>self->pid); }'
> >
> >This is stopping the signal sender, not the signal receiver.
>
> I see. In my scenario it worked because it was the case of HW caused
> trap so ttoproc(curthread) was equal to first argument of sigtoproc().

Indeed.

> BTW appcrash ensures this via 'pid == args[1]->pr_pid' condition in the
> predicate used for proc:::signal-send.
>
> <snip>
>
> >        stop();
> >        system("/usr/bin/prun %d", pid);
> >}
>
> Maybe stupid/ignorant question but I'll ask anyway: why there is no
> start()/run() in dtrace ?

Probably because we want to minimize "funky" processed-in-user-context
actions. system(prun) makes it obvious what's going on.

> >This should work regardless of the source of the segv. (the main trick is
> >calling stop() at the top of issig_forreal(); that will stop the process
> >before
> >the SEGV is processed, letting dtrace get a stack trace from it.)
>
> The script works fine, thanks a lot for it.

I'm glad it works for you.

Cheers,
- jonathan