Richard Lowe
2005-Dec-17 11:32 UTC
[dtrace-discuss] Pid provider segfault possibly introduced by fix for 6322760
The fix for:
6322760 dt_pid.c needs to rework its use of xyerror()
Appears to break error reporting in certain pid provider cases.
% dtrace -qnpid''$target''::: -p 26303
Will now (b27a and b28 tested) segfault rather than giving the error:
dtrace: invalid probe specifier pid$target:::: failed to create probe
in process 26303: Not enough space
This seems to be because of the semantic change in error reporting that
6322760 introduced. xyerror() used to longjmp to a jump_buf that may no
longer exist after parsing. However the introduced dt_pid_error()
routine does not longjmp at all, and at least one error path (this one)
depended on that behaviour.
dt_pid_per_mod() calls Psymbol_iter_by_addr() twice in the globbing
case, both times with dt_pid_sym_filt() as the callback.
dt_pid_sym_filt() invokes dt_pid_per_sym(), which in the create_err
label, calls dt_release_proc() to Prelease the process, and destroy
various other state.
In rev 1.17 of dt_pid.c xyerror() would then be called which would
longjmp back past the second call to Psymbol_iter_by_addr(), and all was
well.
however, with rev 1.18, that call was changed to dt_pid_error() which no
longer longjmps, and allows the second call to Psymbol_iter_by_addr() in
dt_pid_per_mod() to be entered, which will access the pt_pshandle and
other state after it''s been freed in the above.
This will segv with a stack much like:
libelf.so.1`gelf_getsym+0xca(82f13f0, eb, 8046b88)
libproc.so.1`Psymbol_iter_com+0x146(829efd8, ffffffff, 82736b8, 1,
407, 1)
libproc.so.1`Psymbol_iter_by_addr+0x21(829efd8, 82736b8, 1, 407,
d0f78cce,
8046d3c)
libdtrace.so.1`dt_pid_per_mod+0x190(8046d3c, 82afa70, 82736b8)
Two suggested (but possibly bad) fixes would be:
--- common/dt_pid.c (revision 461)
+++ common/dt_pid.c (local)
@@ -78,6 +78,7 @@
va_start(ap, fmt);
dt_set_errmsg(dtp, dt_errtag(tag), NULL, NULL, 0, fmt, ap);
va_end(ap);
+ longjmp(dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER);
}
To restore the longjmping semantics of the error paths.
Or:
--- common/dt_pid.c (revision 461)
+++ common/dt_pid.c (local)
@@ -193,6 +193,7 @@
dt_pid_error(pp->dpp_dtp, D_PROC_CREATEFAIL, "failed to create
"
"probe in process %d: %s", (int)pid,
dtrace_errmsg(pp->dpp_dtp, dtrace_errno(pp->dpp_dtp)));
+ longjmp(pp->dpp_dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER);
}
To only longjmp in that particular instance (as is done in other cases
where the pshandle is inaccessible).
I haven''t filed a bug yet, as I was hoping to to include the better
Suggested Fix, and request a sponsor on it, but need advice on the
better way to fix it :)
Adam Leventhal
2005-Dec-19 16:25 UTC
[dtrace-discuss] Pid provider segfault possibly introduced by fix for 6322760
Hey Rich, Thanks for finding that. Please do file the bug and I''ll think some as well about the right fix -- this file got a bit nasty with 6322760. Adam On Sat, Dec 17, 2005 at 06:32:07AM -0500, Richard Lowe wrote:> The fix for: > 6322760 dt_pid.c needs to rework its use of xyerror() > > Appears to break error reporting in certain pid provider cases. > > % dtrace -qnpid''$target''::: -p 26303 > > Will now (b27a and b28 tested) segfault rather than giving the error: > dtrace: invalid probe specifier pid$target:::: failed to create probe > in process 26303: Not enough space > > This seems to be because of the semantic change in error reporting that > 6322760 introduced. xyerror() used to longjmp to a jump_buf that may no > longer exist after parsing. However the introduced dt_pid_error() > routine does not longjmp at all, and at least one error path (this one) > depended on that behaviour. > > dt_pid_per_mod() calls Psymbol_iter_by_addr() twice in the globbing > case, both times with dt_pid_sym_filt() as the callback. > > dt_pid_sym_filt() invokes dt_pid_per_sym(), which in the create_err > label, calls dt_release_proc() to Prelease the process, and destroy > various other state. > > In rev 1.17 of dt_pid.c xyerror() would then be called which would > longjmp back past the second call to Psymbol_iter_by_addr(), and all was > well. > > however, with rev 1.18, that call was changed to dt_pid_error() which no > longer longjmps, and allows the second call to Psymbol_iter_by_addr() in > dt_pid_per_mod() to be entered, which will access the pt_pshandle and > other state after it''s been freed in the above. > > This will segv with a stack much like: > libelf.so.1`gelf_getsym+0xca(82f13f0, eb, 8046b88) > libproc.so.1`Psymbol_iter_com+0x146(829efd8, ffffffff, 82736b8, 1, > 407, 1) > libproc.so.1`Psymbol_iter_by_addr+0x21(829efd8, 82736b8, 1, 407, > d0f78cce, > 8046d3c) > libdtrace.so.1`dt_pid_per_mod+0x190(8046d3c, 82afa70, 82736b8) > > Two suggested (but possibly bad) fixes would be: > --- common/dt_pid.c (revision 461) > +++ common/dt_pid.c (local) > @@ -78,6 +78,7 @@ > va_start(ap, fmt); > dt_set_errmsg(dtp, dt_errtag(tag), NULL, NULL, 0, fmt, ap); > va_end(ap); > + longjmp(dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); > } > > To restore the longjmping semantics of the error paths. > > Or: > --- common/dt_pid.c (revision 461) > +++ common/dt_pid.c (local) > @@ -193,6 +193,7 @@ > dt_pid_error(pp->dpp_dtp, D_PROC_CREATEFAIL, "failed to create " > "probe in process %d: %s", (int)pid, > dtrace_errmsg(pp->dpp_dtp, dtrace_errno(pp->dpp_dtp))); > + longjmp(pp->dpp_dtp->dt_pcb->pcb_jmpbuf, EDT_COMPILER); > } > > To only longjmp in that particular instance (as is done in other cases > where the pshandle is inaccessible). > > I haven''t filed a bug yet, as I was hoping to to include the better > Suggested Fix, and request a sponsor on it, but need advice on the > better way to fix it :) > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl