Hi All,

I'm looking for some advice: I want to use DTrace to observe the latency and results of all vnode operations. This is something we want to do in production.

For example, I'd like to run a D script using the fop_read:entry/return FBT probes to save uio_resid on entry, subtract off uio_resid after the vop call returns, and then display the number of bytes transferred. Throw in a little vtimestamp subtraction and I've got latency as well.

The trouble is that most of the fop calls are tail-call optimized, which (as you well know) makes the return probe fire before the (*(vp)->v_op->vop_read)(vp, uiop, ioflag, cr, ct); call happens. I've tried the following:

- I followed the suggestions in the DTrace cookbook for dealing with tail-call optimized routines: http://dtrace.eng/recipes/chasetail.html . After the 100K+ probes are enabled, there is a significant performance hit on the system. (And I'll want to use this on systems that are already having problems.)

- I looked at our production compilers and noticed that they have lower optimization levels (e.g., -xO1) that won't do tail-call optimization, but that seems wrong on several levels: it's (a) compiler-specific and (b) too big a hammer (even just for vnode.c).

- I added a static probe after the vop call. This works and has good observed performance, but all I'm really doing is subverting the tail-call optimization.

Seems like there must be a better way. Any suggestions? Thanks for your help,

Rich
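[For reference: the recipe URL above is internal, so the following is only a sketch of the general tail-chasing technique, not the recipe itself. The idea is to mark the thread in fop_read:entry and catch the tail-called vop implementation with wildcard fbt probes — which is why enabling it lights up 100K+ probes:]

--------------------- cut -------------------------
#!/usr/sbin/dtrace -qs

/* Remember that this thread is inside fop_read. */
fbt::fop_read:entry
{
	self->in = 1;
}

/*
 * The first entry probe to fire afterwards (other than
 * fop_read's own wildcard match) is the tail-called
 * implementation: ufs_read, zfs_read, etc.
 */
fbt:::entry
/ self->in == 1 && probefunc != "fop_read" /
{
	self->in = 2;
	self->func = probefunc;
	self->ts = vtimestamp;
}

/* Its return probe fires when the vop really completes. */
fbt:::return
/ self->in == 2 && probefunc == self->func /
{
	@lat[probefunc] = quantize(vtimestamp - self->ts);
	self->in = 0;
}
--------------------- cut -------------------------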
Did you check out Richard McDougall's VFS tracing script?

http://blogs.sun.com/roller/page/rmc/Weblog?catname=%2FDTrace

/jim
Hi Jim,

Thanks for the pointer. It gives information on what the vop intends to do plus any actual I/O performed, but I'm looking for the results of the vop itself. Here's an example of a script that gives me exactly the output I'm looking for, but only on a system that doesn't optimize the tail call for fop routines.

If I use the following D script:

--------------------- cut -------------------------
#!/usr/sbin/dtrace -Cqs

#include <sys/uio.h>

fop_read:entry
/ execname != "dtrace" && ((uio_t *)arg1)->uio_resid != 0 /
{
	self->tstart = vtimestamp;
	self->path = ((vnode_t *)arg0)->v_path;
	self->resid = ((uio_t *)arg1)->uio_resid;
	self->uiop = arg1;
	self->offset = ((uio_t *)arg1)->uio_offset;
}

fop_read:return
/ self->resid /
{
	printf("READ of %s at %d for %ld bytes in %d usec\n",
	    stringof(self->path),
	    self->offset,
	    (self->resid - ((uio_t *)self->uiop)->uio_resid),
	    (vtimestamp - self->tstart)/1000);
}
--------------------- cut -------------------------

And do: dd if=/kernel/genunix of=/dev/null bs=1024k count=1

Then I see the following on an Opteron box running Solaris:

READ of /dev/pts/../../devices/pseudo/pts@0:10 at 7838 for 1 bytes in 7 usec
READ of /usr/bin/dd at 0 for 8 bytes in 10 usec
READ of /usr/bin/dd at 0 for 52 bytes in 1 usec
READ of /usr/bin/dd at 52 for 192 bytes in 1 usec
READ of /usr/bin/dd at 244 for 17 bytes in 1 usec
READ of /lib/ld.so.1 at 0 for 52 bytes in 4 usec
READ of /lib/ld.so.1 at 52 for 160 bytes in 1 usec
READ of /kernel/genunix at 0 for 1048576 bytes in 2479 usec

But if I do this on a SPARC box, I get 0 for the bytes transferred and 0 for the latency, because the 'return' probe fires before the vop call has been made.

Thanks,

Rich
Got it Rich, thanks. Cool script. Unfortunately, I have no further suggestions for getting around your specific problem.

Best,
/jim
Roch Bourbonnais - Performance Engineering
2005-Aug-03 10:54 UTC
[dtrace-discuss] Getting around tail-calls...
You could spell out the *fs_read routines of interest. Include the -Z option in case a given FS is not installed:

	ufs_read:entry,
	zfs_read:entry,
	qfs_read:entry,
	vxfs_read:entry,
	nfs3_read:entry,
	nfs4_read:entry...
	/ execname != "dtrace" && ((uio_t *)arg1)->uio_resid != 0 /

It won't be too ugly.

-r
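[Fleshed out, this suggestion combined with the fop_read script from earlier in the thread might look like the sketch below. The list of *fs_read routines is illustrative; with -Z, probes for file systems that aren't loaded are simply skipped rather than causing an error:]

--------------------- cut -------------------------
#!/usr/sbin/dtrace -CqZs

#include <sys/uio.h>

ufs_read:entry,
zfs_read:entry,
nfs3_read:entry,
nfs4_read:entry
/ execname != "dtrace" && ((uio_t *)arg1)->uio_resid != 0 /
{
	self->tstart = vtimestamp;
	self->path = ((vnode_t *)arg0)->v_path;
	self->resid = ((uio_t *)arg1)->uio_resid;
	self->uiop = arg1;
	self->offset = ((uio_t *)arg1)->uio_offset;
}

ufs_read:return,
zfs_read:return,
nfs3_read:return,
nfs4_read:return
/ self->resid /
{
	printf("READ of %s at %d for %ld bytes in %d usec\n",
	    stringof(self->path), self->offset,
	    (self->resid - ((uio_t *)self->uiop)->uio_resid),
	    (vtimestamp - self->tstart)/1000);
	self->resid = 0;
}
--------------------- cut -------------------------

[Unlike fop_read, these per-file-system routines are the targets of the tail call, so their return probes fire after the I/O has actually completed.]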
Hi Roch,

I was thinking of something similar recently. I didn't know about the -Z option, which makes this viable. Thanks! (David Robinson and I were discussing various ways to discover what vnode operations might be available. That can get ugly pretty quickly.)

My goal is still to do this from the fop_* level. One of the reasons those calls were added was to give us a file-system-independent place to stand to do just these kinds of things.

I've had performance runs (PerfPIT) done on a build with static probes added to fop_read, fop_write, and fop_getattr. So far, there are no statistically significant regressions in the results. If this doesn't pan out, your suggestion is at the top of my list.

Thanks,

Rich
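[For what it's worth, the static-probe approach described above presumably looks something like the following in the kernel — a hypothetical sketch using the standard SDT DTRACE_PROBE macros from <sys/sdt.h>, not the actual putback:]

--------------------- cut -------------------------
#include <sys/sdt.h>

int
fop_read(vnode_t *vp, uio_t *uiop, int ioflag, cred_t *cr,
    caller_context_t *ct)
{
	int err;

	/*
	 * Storing the result and firing a probe after the call
	 * keeps the compiler from emitting a tail call, so both
	 * this probe and fop_read:return fire after the vop has
	 * actually completed.
	 */
	err = (*(vp)->v_op->vop_read)(vp, uiop, ioflag, cr, ct);
	DTRACE_PROBE4(fop__read__done, vnode_t *, vp, uio_t *, uiop,
	    int, ioflag, int, err);
	return (err);
}
--------------------- cut -------------------------

[This would surface as an sdt probe named fop-read-done, usable without enabling the 100K+ wildcard fbt probes.]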