Richard Lowe
2005-Dec-17 11:32 UTC
[dtrace-discuss] Pid provider segfault possibly introduced by fix for 6322760
The fix for: 6322760 dt_pid.c needs to rework its use of xyerror() Appears to break error reporting in certain pid provider cases. % dtrace -qnpid''$target''::: -p 26303 Will now (b27a and b28 tested) segfault rather than giving the error: dtrace: invalid probe specifier pid$target:::: failed to create probe in process 26303: Not enough space This seems to be because of the semantic change in error reporting that 6322760 introduced. xyerror() used to longjmp to a jump_buf that may no longer exist after parsing. However the introduced dt_pid_error() routine does not longjmp at all, and at least one error path (this one) depended on that behaviour. dt_pid_per_mod() calls Psymbol_iter_by_addr() twice in the globbing case, both times with dt_pid_sym_filt() as the callback. dt_pid_sym_filt() invokes dt_pid_per_sym(), which in the create_err label, calls dt_release_proc() to Prelease the process, and destroy various other state. In rev 1.17 of dt_pid.c xyerror() would then be called which would longjmp back past the second call to Psymbol_iter_by_addr(), and all was well. however, with rev 1.18, that call was changed to dt_pid_error() which no longer longjmps, and allows the second call to Psymbol_iter_by_addr() in dt_pid_per_mod() to be entered, which will access the pt_pshandle and other state after it''s been freed in the above. This will segv with a stack much like: libelf.so.1`gelf_getsym+0xca(82f13f0, eb, 8046b88) libproc.so.1`Psymbol_iter_com+0x146(829efd8, ffffffff, 82736b8, 1, 407, 1) libproc.so.1`Psymbol_iter_by_addr+0x21(829efd8, 82736b8, 1, 407, d0f78cce, 8046d3c) libdtrace.so.1`dt_pid_per_mod+0x190(8046d3c, 82afa70, 82736b8) Two suggested (but possibly bad) fixes would be: --- common/dt_pid.c (revision 461) +++ common/dt_pid.c (local) @@ -78,6 +78,7 @@ va_start(ap, fmt); dt_set_errmsg(dtp, dt_errtag(tag), NULL, NULL, 0, fmt, ap); va_end(ap); + longjmp(dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); } To restore the longjmping semantics of the error paths. Or: --- common/dt_pid.c (revision 461) +++ common/dt_pid.c (local) @@ -193,6 +193,7 @@ dt_pid_error(pp->dpp_dtp, D_PROC_CREATEFAIL, "failed to create " "probe in process %d: %s", (int)pid, dtrace_errmsg(pp->dpp_dtp, dtrace_errno(pp->dpp_dtp))); + longjmp(pp->dpp_dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); } To only longjmp in that particular instance (as is done in other cases where the pshandle is inaccessible). I haven''t filed a bug yet, as I was hoping to to include the better Suggested Fix, and request a sponsor on it, but need advice on the better way to fix it :)
Adam Leventhal
2005-Dec-19 16:25 UTC
[dtrace-discuss] Pid provider segfault possibly introduced by fix for 6322760
Hey Rich, Thanks for finding that. Please do file the bug and I''ll think some as well about the right fix -- this file got a bit nasty with 6322760. Adam On Sat, Dec 17, 2005 at 06:32:07AM -0500, Richard Lowe wrote:> The fix for: > 6322760 dt_pid.c needs to rework its use of xyerror() > > Appears to break error reporting in certain pid provider cases. > > % dtrace -qnpid''$target''::: -p 26303 > > Will now (b27a and b28 tested) segfault rather than giving the error: > dtrace: invalid probe specifier pid$target:::: failed to create probe > in process 26303: Not enough space > > This seems to be because of the semantic change in error reporting that > 6322760 introduced. xyerror() used to longjmp to a jump_buf that may no > longer exist after parsing. However the introduced dt_pid_error() > routine does not longjmp at all, and at least one error path (this one) > depended on that behaviour. > > dt_pid_per_mod() calls Psymbol_iter_by_addr() twice in the globbing > case, both times with dt_pid_sym_filt() as the callback. > > dt_pid_sym_filt() invokes dt_pid_per_sym(), which in the create_err > label, calls dt_release_proc() to Prelease the process, and destroy > various other state. > > In rev 1.17 of dt_pid.c xyerror() would then be called which would > longjmp back past the second call to Psymbol_iter_by_addr(), and all was > well. > > however, with rev 1.18, that call was changed to dt_pid_error() which no > longer longjmps, and allows the second call to Psymbol_iter_by_addr() in > dt_pid_per_mod() to be entered, which will access the pt_pshandle and > other state after it''s been freed in the above. > > This will segv with a stack much like: > libelf.so.1`gelf_getsym+0xca(82f13f0, eb, 8046b88) > libproc.so.1`Psymbol_iter_com+0x146(829efd8, ffffffff, 82736b8, 1, > 407, 1) > libproc.so.1`Psymbol_iter_by_addr+0x21(829efd8, 82736b8, 1, 407, > d0f78cce, > 8046d3c) > libdtrace.so.1`dt_pid_per_mod+0x190(8046d3c, 82afa70, 82736b8) > > Two suggested (but possibly bad) fixes would be: > --- common/dt_pid.c (revision 461) > +++ common/dt_pid.c (local) > @@ -78,6 +78,7 @@ > va_start(ap, fmt); > dt_set_errmsg(dtp, dt_errtag(tag), NULL, NULL, 0, fmt, ap); > va_end(ap); > + longjmp(dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); > } > > To restore the longjmping semantics of the error paths. > > Or: > --- common/dt_pid.c (revision 461) > +++ common/dt_pid.c (local) > @@ -193,6 +193,7 @@ > dt_pid_error(pp->dpp_dtp, D_PROC_CREATEFAIL, "failed to create " > "probe in process %d: %s", (int)pid, > dtrace_errmsg(pp->dpp_dtp, dtrace_errno(pp->dpp_dtp))); > + longjmp(pp->dpp_dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); > } > > To only longjmp in that particular instance (as is done in other cases > where the pshandle is inaccessible). > > I haven''t filed a bug yet, as I was hoping to to include the better > Suggested Fix, and request a sponsor on it, but need advice on the > better way to fix it :) > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl