I am running a small script (attached) that timestamps file opens and
closes.
When I run it under real load I get tons of messages like
dtrace: 79575 drops on CPU 12
dtrace: 140709 drops on CPU 13
dtrace: 125092 drops on CPU 14
dtrace: 38129 drops on CPU 15
...
and sometimes
dtrace: error on enabled probe ID 4 (ID 10200: fbt:genunix:closeandsetf:entry):
out of scratch space in action #2 at DIF offset 508
dtrace: error on enabled probe ID 2 (ID 6227: fbt:genunix:copen:return): out of
scratch space in action #3 at DIF offset 28
dtrace: error on enabled probe ID 4 (ID 10200: fbt:genunix:closeandsetf:entry):
out of scratch space in action #2 at DIF offset 508
dtrace: error on enabled probe ID 4 (ID 10200: fbt:genunix:closeandsetf:entry):
out of scratch space in action #2 at DIF offset 508
dtrace: error on enabled probe ID 4 (ID 10200: fbt:genunix:closeandsetf:entry):
out of scratch space in action #2 at DIF offset 508
Unfortunately DTrace is not telling me what these drops are, so I don't
know which limit to tune (or why they are happening in the first place).
----------------------------------------------------------------------
#!/usr/sbin/dtrace -Cs
#pragma ident $Id: openrace.d,v 1.2 2007/07/25 21:51:16 akolb Exp akolb $
#pragma D option aggsortrev
#pragma D option quiet
#pragma D option dynvarsize=4m
#include <sys/types.h>
#include <sys/file.h>
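/*
 * Helpers: curufinfo is the current process's open-file table, curargs
 * its command line, and f_name[fd] resolves a file descriptor to a
 * cleaned-up vnode path (or "<unknown>" if it cannot be resolved).
 */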
inline uf_info_t curufinfo = curthread->t_procp->p_user.u_finfo;
inline string curargs = curpsinfo->pr_psargs;
inline string f_name[int fd] = (fd >= 0 && fd <
curthread->t_procp->p_user.u_finfo.fi_nfiles ?
(curufinfo.fi_list[fd].uf_file ?
(curufinfo.fi_list[fd].uf_file->f_vnode->v_path != NULL ?
cleanpath(curufinfo.fi_list[fd].uf_file->f_vnode->v_path) :
"<unknown>") :
"<unknown>") :
"<unknown>");
fbt::copen:entry
/curpsinfo->pr_projid == $1/
{
self->opentime = walltimestamp;
self->name = arg1;
self->mode = !(args[2] & FREAD) ? 'W' :
    (args[2] == FREAD ? 'R' : 'M');
}
fbt::copen:return
/self->opentime && arg1 >= 0/
{
printf("%d\t%c O %s %s %s\n",
self->opentime, self->mode, copyinstr(self->name),
cwd, curargs);
}
fbt::copen:return
/self->opentime/
{
self->opentime = 0;
self->name = 0;
self->mode = 0;
}
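/*
 * Log each close in the traced project, resolving the descriptor back
 * to a pathname via f_name[].
 */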
fbt::closeandsetf:entry
/curpsinfo->pr_projid == $1/
{
printf("%d\tC C %s %s %s\n",
walltimestamp, f_name[arg0], cwd, curargs);
}
----------------------------------------------------------------------
- akolb
On Fri, Sep 21, 2007 at 07:36:20PM -0700, Alexander Kolbasov wrote:
> I am running a small script (attached) that timestamps file opens and
> closes.
> [...]
> Unfortunately DTrace is not telling me what these drops are, so I don't
> know which limit to tune (or why they are happening in the first place).

Both of these messages are symptoms of the same problem: you're running
out of principal buffer space. You can deal with this by either increasing
the buffer size (e.g. -x bufsize=20m), increasing the switch rate (e.g.
-x switchrate=123hz), or some combination of both.

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
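(For reference, the same tuning can also be embedded in the script itself
rather than passed on the command line -- a minimal sketch using the
example values from the reply above; pick values to suit your workload:)

#pragma D option bufsize=20m
#pragma D option switchrate=123hz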
> > Unfortunately DTrace is not telling me what these drops are, so I don't
> > know which limit to tune (or why they are happening in the first place).
>
> Both of these messages are symptoms of the same problem: you're running
> out of principal buffer space. You can deal with this by either increasing
> the buffer size (e.g. -x bufsize=20m), increasing the switch rate (e.g.
> -x switchrate=123hz), or some combination of both.

It seems that the largest bufsize I can get is 16m; otherwise I get a

        dtrace: could not enable tracing: File too large

message. I tried a 16m bufsize with switch rates of 123, 257 and 1025hz
and I still see lots of drops. Is there any other magic knob I can try?

        - akolb
On Mon, Sep 24, 2007 at 04:51:24PM -0700, Alexander Kolbasov wrote:
> It seems that the largest bufsize I can get is 16m; otherwise I get a
>
>         dtrace: could not enable tracing: File too large
>
> message.

That's a surprisingly low number. Are there a tremendous number of CPUs
on this system? (If this is a DEBUG kernel, try non-DEBUG -- kmem debugging
takes up a ton of wired-down memory.)

> I tried a 16m bufsize with switch rates of 123, 257 and 1025hz and I
> still see lots of drops. Is there any other magic knob I can try?

Are the numbers going down at least? And is there any way that you can
use predicates and/or aggregations to prune or aggregate the data flow?

        - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
> That's a surprisingly low number. Are there a tremendous number of CPUs
> on this system? (If this is a DEBUG kernel, try non-DEBUG -- kmem debugging
> takes up a ton of wired-down memory.)

This is a 16-CPU x86 system running non-DEBUG snv_73.

> Are the numbers going down at least? And is there any way that you can
> use predicates and/or aggregations to prune or aggregate the data flow?

It is difficult to say whether they are going down. It seems that the drops
only happen when this clause is enabled:

fbt::closeandsetf:entry
/curpsinfo->pr_projid == $1/
{
printf("%d\tC C %s %s %s\n",
    walltimestamp, fds[arg0].fi_pathname, cwd, curargs);
}

Once I comment out the printf the drops disappear.

I am trying to debug race conditions during an ON build, where I'd like to
get timestamps when a file is opened and closed. I can try using global
arrays to keep track of open files, but that seems even more prone to
issues.

        - akolb
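(A sketch of the aggregation-based approach Bryan is suggesting, under the
assumption that per-file close counts -- rather than individual
timestamps -- would be enough for the problem at hand; aggregations are
summarized in the kernel and read out periodically, so they put far less
pressure on the principal buffer than a printf() per event:)

fbt::closeandsetf:entry
/curpsinfo->pr_projid == $1/
{
        @closes[fds[arg0].fi_pathname] = count();
}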
Hi,
Same here when using the dvm agent and JDK 1.5: I am getting a lot of
drops on a Thinkpad T43. Setting bufsize even up to the 128m limit does
not help, nor does specifying a switchrate; I have tried both options
without success.
I'm running on Solaris Nevada snv_64a x86.
I'm trying to debug, as an example, JAlbum, a Java-based application. This
is a simple script:
dvm$target:::method-entry
{
self->ts[copyinstr(arg0),copyinstr(arg1)] = vtimestamp;
}
dvm$target:::method-return
{
@ts[copyinstr(arg0),copyinstr(arg1)] = sum(vtimestamp -
self->ts[copyinstr(arg0),copyinstr(arg1)]);
}
$ dtrace -x bufsize=128m -s getime-methods.d -p 3447
dtrace: script 'getime-methods.d' matched 2 probes
dtrace: 2330 dynamic variable drops
dtrace: 7385 dynamic variable drops
dtrace: 9162 dynamic variable drops
dtrace: 10928 dynamic variable drops
dtrace: 10552 dynamic variable drops
dtrace: 6594 dynamic variable drops
dtrace: 2853 dynamic variable drops
^C
thanks,
stefan
--
This message posted from opensolaris.org
Brrr, looks like I missed my coffee: I was talking about the dynamic
variable pool, so this is off topic here. I had increased that to 32m and
it worked fine.

stefan
--
This message posted from opensolaris.org
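(A minimal sketch combining the fix above with one further assumption of
mine -- that the thread-local entries in self->ts are never released and
therefore pile up until the dynamic variable space fills: raise dynvarsize,
and zero each entry once it has been consumed.)

#pragma D option dynvarsize=32m

dvm$target:::method-entry
{
        self->ts[copyinstr(arg0), copyinstr(arg1)] = vtimestamp;
}

dvm$target:::method-return
{
        this->class = copyinstr(arg0);
        this->method = copyinstr(arg1);

        @ts[this->class, this->method] = sum(vtimestamp -
            self->ts[this->class, this->method]);

        /* Free the entry; otherwise it lingers in the dynamic variable
           space for the life of the thread. */
        self->ts[this->class, this->method] = 0;
}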
From what I understand on this thread:
http://forum.java.sun.com/thread.jspa?threadID=5133053&tstart=30
there is a hardcoded limit in DTrace of 16M for the buffer size when you
are not running as root. Why, why, why??
--
This message posted from opensolaris.org
On Mon, Apr 21, 2008 at 12:33:47AM -0700, Lars Bruun-Hansen wrote:
> there is a hardcoded limit in DTrace of 16M for the buffer size when
> you are not running as root. Why, why, why??

We don't want to let non-root users exhaust system resources. A root user
can change this by modifying dtrace_nonroot_maxsize via 'mdb -k' or by
adding a set directive to /etc/system (check out system(4)).

Adam

--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl
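(A sketch of what that could look like in practice -- the variable name
comes from Adam's mail, but the dtrace: module prefix, the 32MB value and
the mdb write syntax are assumptions on my part, so check them against
your system:)

* In /etc/system: raise the non-root DTrace buffer ceiling to 32MB.
set dtrace:dtrace_nonroot_maxsize=0x2000000

Or on a running system (takes effect immediately, not preserved across
reboot):

# echo 'dtrace_nonroot_maxsize/Z 0x2000000' | mdb -kw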