Mark Harrison
2008-Aug-07 20:31 UTC
[dtrace-discuss] system unresponsive error with file upload monitor script
Hi all,

I'm trying to write a dtrace script to print a list of files uploaded via sftp. I have it working pretty well, except after the script has been running for a few minutes, I get the "dtrace: processing aborted: Abort due to systemic unresponsiveness" error.

Is there anything I can use to try to find the source of the error, or is there something I am missing?

Regards,

Mark

Here's the script so far (ran with dtrace -q -C -s scriptname.d):

    #define checkdir "/zones/somezone/root/path/to/uploads"

    /* I'm triggering on a fop_open after fop_create (fop_create doesn't always
     * have the file path available, whereas fop_open does), hence the self->icare
     * variable. */

    fsinfo::fop_create:
    / arg0 /
    {
        self->icare = 1;
    }

    /* Check to make sure the file was created via sftp, and check that it is in
     * the right dir (the beginning of the path has to match checkdir) */
    fsinfo::fop_open:
    /
        self->icare == 1 &&
        execname == "sftp-server" &&
        ((vnode_t *)arg0)->v_path &&
        strstr(((vnode_t *)arg0)->v_path, checkdir)
            == ((vnode_t *)arg0)->v_path
    /
    {
        printf("%s\n", stringof(((vnode_t*)arg0)->v_path));
    }

    /* If we don't have the right dir, then we need to reset the icare var so we
     * don't check every fop_open from now on */
    fsinfo::fop_open:
    / self->icare == 1 /
    {
        self->icare = 0;
    }

--
Mark Harrison
Systems Administrator
OmniTI Computer Consulting, Inc.
Chad Mynhier
2008-Aug-08 00:44 UTC
[dtrace-discuss] system unresponsive error with file upload monitor script
On Thu, Aug 7, 2008 at 4:31 PM, Mark Harrison <mark at omniti.com> wrote:
> Hi all,
>
> I'm trying to write a dtrace script to print a list of files uploaded via
> sftp. I have it working pretty well, except after the script has been running
> for a few minutes, I get the "dtrace: processing aborted: Abort due to
> systemic unresponsiveness" error.

As a workaround, you might try running with the -w flag ("permit destructive actions"). This will skip the deadman timeout processing. (This will only work if you're running as root, though.)

I'm curious, though, what hardware is this? I've seen similar behavior before on an x4600 (16 cores) that was likely a problem with skewed tsc values that cause some gethrtime() weirdness.

Chad
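
For reference, the -w flag has an in-script equivalent, the "destructive" option, which can be set with a pragma at the top of the D file. The following is only a minimal sketch to show the placement: the BEGIN clause and its message are illustrative and not part of the original script, and whether this actually avoids the abort on a given system is an assumption based on the workaround described above.

```d
/* Equivalent of passing -q on the command line. */
#pragma D option quiet

/* Equivalent of passing -w: permit destructive actions.
 * Per the workaround above, this is expected to skip the
 * deadman timeout processing; it requires sufficient privilege. */
#pragma D option destructive

/* Illustrative clause only; the real probe clauses would follow here. */
dtrace:::BEGIN
{
    printf("destructive actions permitted for this consumer\n");
}
```

With the pragmas in place the script can be started without -q or -w on the command line. Note that this only suppresses the liveness check; it does not explain why the check fired.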
Adam Leventhal
2008-Aug-08 00:45 UTC
[dtrace-discuss] system unresponsive error with file upload monitor script
Well, you can disable the DTrace liveness checking by enabling destructive actions, or you could try using an aggregation rather than printf() to trace each open.

Adam

On Thu, Aug 07, 2008 at 04:31:31PM -0400, Mark Harrison wrote:
> Hi all,
>
> I'm trying to write a dtrace script to print a list of files uploaded via
> sftp. I have it working pretty well, except after the script has been running
> for a few minutes, I get the "dtrace: processing aborted: Abort due to
> systemic unresponsiveness" error.
>
> Is there anything I can use to try to find the source of the error, or is
> there something I am missing?

--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl
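
The aggregation suggestion could look roughly like the sketch below. It reuses the clauses from the original script unchanged, except that the printf() is replaced by an aggregation keyed on the path. The aggregation name @uploads and the 10-second reporting interval are assumptions chosen for illustration, and it is not established in this thread that the change removes the abort.

```d
#define checkdir "/zones/somezone/root/path/to/uploads"

/* Same trigger as the original: remember that this thread created a file. */
fsinfo::fop_create:
/ arg0 /
{
    self->icare = 1;
}

/* Same predicate as the original, but instead of printing each open,
 * count it in an aggregation keyed on the file path. */
fsinfo::fop_open:
/
    self->icare == 1 &&
    execname == "sftp-server" &&
    ((vnode_t *)arg0)->v_path &&
    strstr(((vnode_t *)arg0)->v_path, checkdir)
        == ((vnode_t *)arg0)->v_path
/
{
    @uploads[stringof(((vnode_t *)arg0)->v_path)] = count();
}

/* Reset the flag so later opens on this thread are not checked. */
fsinfo::fop_open:
/ self->icare == 1 /
{
    self->icare = 0;
}

/* Every 10 seconds, print the paths seen so far with their open counts,
 * then empty the aggregation so each path is reported once per interval. */
profile:::tick-10sec
{
    printa("%-60s %@d\n", @uploads);
    trunc(@uploads);
}
```

It would be run the same way as the original (dtrace -q -C -s scriptname.d). The trade-off is that file names appear in batches at each tick rather than at the moment of the open.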
Mark Harrison
2008-Aug-08 15:07 UTC
[dtrace-discuss] system unresponsive error with file upload monitor script
> As a workaround, you might try running with the -w flag ("permit
> destructive actions".) This will skip the deadman timeout processing.
> (This will only work if you're running as root, though.)

I may end up doing that. I'm concerned that it is stopping for a reason, however, and would like to see if I can get it working without needing to turn on destructive actions.

> I'm curious, though, what hardware is this? I've seen similar
> behavior before on an x4600 (16 cores) that was likely a problem with
> skewed tsc values that cause some gethrtime() weirdness.

This is two dual-core Opterons.

--
Mark Harrison
Systems Administrator
OmniTI Computer Consulting, Inc.
Chad Mynhier
2008-Aug-08 17:50 UTC
[dtrace-discuss] system unresponsive error with file upload monitor script
On Fri, Aug 8, 2008 at 10:07 AM, Mark Harrison <mark at omniti.com> wrote:
>> As a workaround, you might try running with the -w flag ("permit
>> destructive actions".) This will skip the deadman timeout processing.
>> (This will only work if you're running as root, though.)
>
> I may end up doing that. I'm concerned that it is stopping for a reason
> however, and would like to see if I can get it working without needing to
> turn on destructive actions.

The bug I'm describing is 6507659 ("tsc differences between CPU's give dtrace_gethrtime() serious problems"). This is fixed in snv_58 and is available in patch 127112-03 (obsoleted by 127128-11). If you're running a version with this bug fixed, then your problem is likely something else.

Chad