FYI, I posted a blog a few days ago about a DTrace provider for NFS that is
currently in development:

http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs

Let's discuss any questions, comments, etc. here. I also advertised this on
nfs-discuss at opensolaris.org. Naturally, I would expect the discussion here
to be more on the specifics of DTrace, and the discussion on nfs-discuss to
be more about NFS. Feel free to join either or both discussions.

- Sam
On Mon, Jan 02, 2006 at 09:48:08AM -0700, Sam Falkner wrote:
> FYI, I posted a blog a few days ago about a DTrace provider for NFS
> that is currently in development:
>
> http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs
>
> Let's discuss any questions, comments, etc. here. I also advertised
> this on nfs-discuss at opensolaris.org. Naturally, I would expect the
> discussion here to be more on the specifics of DTrace, and the
> discussion on nfs-discuss to be more about NFS. Feel free to join
> either or both discussions.

This is way cool!

Things to add to args[0]:

 - cred_t (server-side)
 - RPC credentials:
    - RPCSEC_GSS info:
       - RPCSEC_GSS handle
       - sec triple (mechanism OID, QoP, GSS protection services)
       - client principal display name
       - server principal display name

The actual buffer containing an operation's octets would be nice.

Nico
--
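If fields like these do land in args[0], scripts could key on them directly.
A minimal sketch, assuming a hypothetical nfs4 provider, op-*:start probes,
and a noi_principal member - none of which is part of the prototype yet:

#!/usr/sbin/dtrace -qs
/*
 * Hypothetical: count NFSv4 operations per RPCSEC_GSS client principal.
 * The nfs4 provider, the op-*:start probes, and the noi_principal member
 * are assumptions, not the prototype's actual interface.
 */
nfs4::op-*:start
{
        @ops[args[0]->noi_principal, probefunc] = count();
}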
On Jan 2, 2006, at 10:44 AM, Nicolas Williams wrote:

> On Mon, Jan 02, 2006 at 09:48:08AM -0700, Sam Falkner wrote:
>> FYI, I posted a blog a few days ago about a DTrace provider for NFS
>> that is currently in development:
>>
>> http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs
>>
>> Let's discuss any questions, comments, etc. here. I also advertised
>> this on nfs-discuss at opensolaris.org. Naturally, I would expect the
>> discussion here to be more on the specifics of DTrace, and the
>> discussion on nfs-discuss to be more about NFS. Feel free to join
>> either or both discussions.
>
> This is way cool!

Thanks!

> Things to add to args[0]:
>
> - cred_t (server-side)
> - RPC credentials:
>    - RPCSEC_GSS info:
>       - RPCSEC_GSS handle
>       - sec triple (mechanism OID, QoP, GSS protection services)
>       - client principal display name
>       - server principal display name

Yes -- these seem very good.

> The actual buffer containing an operation's octets would be nice.

Do you mean "everything sent over the channel", i.e. over-the-wire except
without any encryption? I'll have a look at how this would be implemented.
I'll let you know...

- Sam
On Mon, Jan 02, 2006 at 12:55:43PM -0700, Sam Falkner wrote:
> On Jan 2, 2006, at 10:44 AM, Nicolas Williams wrote:
>> The actual buffer containing an operation's octets would be nice.
>
> Do you mean "everything sent over the channel", i.e. over-the-wire
> except without any encryption? I'll have a look at how this would be
> implemented. I'll let you know...

You can get at said octets in the clear because where you'd get at them
you're above RPCSEC_GSS (and well above SSHv2 and/or ESP/AH [IPsec]).

So, getting the data in the clear is not the problem so much as finding
the right places to put the probes.

On the client-side rfs4call() looks like the right place. On the
server-side rfs4_compound() looks like the right place. And
rfs4_compound() is where you'd get the cred_t, client principal and sec
triple info.

Nico
--
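Until such probes exist, the same spots Nico names can be watched directly
with fbt. A minimal sketch, leaving the module field blank so the functions
match wherever they live in a given build:

# dtrace -n '
    fbt::rfs4call:entry      { @client[probefunc] = count(); }
    fbt::rfs4_compound:entry { @server[probefunc] = count(); }'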
On Mon, Jan 02, 2006 at 02:11:37PM -0600, Nicolas Williams wrote:
> On Mon, Jan 02, 2006 at 12:55:43PM -0700, Sam Falkner wrote:
>> On Jan 2, 2006, at 10:44 AM, Nicolas Williams wrote:
>>> The actual buffer containing an operation's octets would be nice.
>>
>> Do you mean "everything sent over the channel", i.e. over-the-wire
>> except without any encryption? I'll have a look at how this would be
>> implemented. I'll let you know...

And BTW, one reason for this is so you can see the actual RPCs in the
clear even if you're using privacy protection. Hooks can be provided so
that privacy protected RPCs can be decoded properly (e.g., ethereal has
functionality of this sort for Kerberos V, though you'd need better hooks
than what it uses if the mechanism provided PFS) -- but it should be
easier to just get the cleartext from the right place via dtrace, no?

Or is this just not a problem because privacy or no privacy should make
no difference in the otw ops? You could always debug without privacy
protection I suppose...
On Jan 2, 2006, at 1:33 PM, Nicolas Williams wrote:

> On Mon, Jan 02, 2006 at 02:11:37PM -0600, Nicolas Williams wrote:
>> On Mon, Jan 02, 2006 at 12:55:43PM -0700, Sam Falkner wrote:
>>> On Jan 2, 2006, at 10:44 AM, Nicolas Williams wrote:
>>>> The actual buffer containing an operation's octets would be nice.
>>>
>>> Do you mean "everything sent over the channel", i.e. over-the-wire
>>> except without any encryption? I'll have a look at how this would be
>>> implemented. I'll let you know...
>
> And BTW, one reason for this is so you can see the actual RPCs in the
> clear even if you're using privacy protection. Hooks can be provided so
> that privacy protected RPCs can be decoded properly (e.g., ethereal has
> functionality of this sort for Kerberos V, though you'd need better hooks
> than what it uses if the mechanism provided PFS) -- but it should be
> easier to just get the cleartext from the right place via dtrace, no?
>
> Or is this just not a problem because privacy or no privacy should make
> no difference in the otw ops? You could always debug without privacy
> protection I suppose...

No, not always. I've heard of one story where a problem only occurred with
krb5p, and the poor fellows working on it (you know who you are!) couldn't
use snoop(1m). So yeah, this is a great idea.

In fact, one example script I want to eventually write is a DTrace script
that more or less simulates snoop(1m). It'll probably be NFSv4 only, and
won't show any lower layers, but it should be useful for those times when
privacy is in effect.

- Sam
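A rough idea of what such a snoop(1m)-style script could look like, assuming
the hypothetical nfs4 provider naming used in this thread and an args[0]
member for the remote host (noi_remote), neither of which is final:

#!/usr/sbin/dtrace -qs
/*
 * Hypothetical snoop(1m)-style trace: one line per NFSv4 operation as it
 * starts. The nfs4 provider, the op-*:start probes, and the noi_remote
 * member are all assumptions.
 */
nfs4::op-*:start
{
        printf("%Y %-16s %s\n", walltimestamp, args[0]->noi_remote, probefunc);
}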
Hey Sam,

This is obviously great stuff -- thanks for undertaking it!

This might be a dumb question, but your blog seems to focus exclusively
on NFSv4. Would supporting a legacy (or future) version of NFS require
a new version of the provider? If that's the case, then perhaps the
provider name should contain a 4; otherwise, it would be good to provide
some documentation on how it might support older versions and how it
might be extended to support subsequent versions.

Regarding arguments, I'm not sure that we really want to expose types
such as READ4args which are (I believe) currently just implementation
details. Would it make sense to invent a new structure or just enumerate
the arguments to the operation in args[1..N]?

I've always been (and continue to be) a little confused by compound
operations. My understanding is that a compound operation is a single
command that can invoke many operations at one go. I can imagine someone
enabling the nfs::op-create:start probe (or whatever) and drawing the
wrong conclusions because the create operations were mostly part of
compound operations. Assuming I haven't constructed a completely farcical
scenario, would it make sense to have a compound operation fire the
op-compound probe as well as any probes for operations collected by that
compound operation?

Presumably this new provider requires some changes to the nfs and nfssrv
kernel modules as well as the addition of a new kernel module for the
provider itself. It would be great if you could release those binaries
and source so we could start to experiment with them on our own systems.

Thanks.

Adam

On Mon, Jan 02, 2006 at 09:48:08AM -0700, Sam Falkner wrote:
> FYI, I posted a blog a few days ago about a DTrace provider for NFS
> that is currently in development:
>
> http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs
>
> Let's discuss any questions, comments, etc. here. I also advertised
> this on nfs-discuss at opensolaris.org. Naturally, I would expect the
> discussion here to be more on the specifics of DTrace, and the
> discussion on nfs-discuss to be more about NFS. Feel free to join
> either or both discussions.
>
> - Sam
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
On Jan 2, 2006, at 7:07 PM, Adam Leventhal wrote:

> Hey Sam,
>
> This is obviously great stuff -- thanks for undertaking it!

No prob!

> This might be a dumb question, but your blog seems to focus exclusively
> on NFSv4. Would supporting a legacy (or future) version of NFS require
> a new version of the provider? If that's the case, then perhaps the
> provider name should contain a 4; otherwise, it would be good to provide
> some documentation on how it might support older versions and how it
> might be extended to support subsequent versions.

I wasn't going to try to support older versions in the near term, but I
didn't want to exclude the possibility either. I hadn't really thought
about whether a different provider would be a good idea, but now I think
that it might. An NFSv3 DTrace provider's probes would probably fall into
a different pattern than v4, so it should probably be separated into
another provider.

For now, I'll add a "4" to the end of the providers' names.

> Regarding arguments, I'm not sure that we really want to expose types
> such as READ4args which are (I believe) currently just implementation
> details. Would it make sense to invent a new structure or just enumerate
> the arguments to the operation in args[1..N]?

One reason I favor a structure is that it looks very difficult to me for
a DTrace provider to give more than five arguments. So OPEN4args, which
has six arguments, would be tough to break into args[1..6]. So I guess
there are two questions here. First, is it a good idea to use structures,
and second, if it is a good idea, how do we do it?

Structures such as READ4args are defined by RFC 3530 (the document that
describes NFSv4). As implemented by OpenSolaris, there may be some extra
members in those structures, but someone understanding RFC 3530 could
just use the "real" members only. The extra members of the structures
would potentially be useful, but it does raise problems as to their
stability. Of course, translators could easily translate them to the
"vanilla RFC 3530" versions of the structures. Maybe that's the way to go.

Future minor versions of NFSv4 (e.g. the upcoming 4.1) will introduce new
operations, arguments, results, etc. And future minor versions may even
deprecate old operations (and hence their arguments). But as long as
OpenSolaris supports NFSv4.0, READ4args should be valid.

> I've always been (and continue to be) a little confused by compound
> operations. My understanding is that a compound operation is a single
> command that can invoke many operations at one go.

Yes, this is right.

> I can imagine someone enabling the nfs::op-create:start probe (or
> whatever) and drawing the wrong conclusions because the create
> operations were mostly part of compound operations.

Hopefully this wouldn't be confusing, but let's see how it plays out.

> Assuming I haven't constructed a completely farcical scenario, would it
> make sense to have a compound operation fire the op-compound probe as
> well as any probes for operations collected by that compound operation?

Yes, that's exactly how it's implemented! There's a hook in the nfs
kernel module that corresponds to op-compound. op-compound drives any
enabled subordinate probes, e.g. op-setattr. Things like op-setattr can
further drive attr-* (attribute based) probes, e.g. attr-size.
So, doing a truncate() on the client can drive the client to fire the
probes op-compound, op-setattr, and attr-size (and potentially many more).

> Presumably this new provider requires some changes to the nfs and nfssrv
> kernel modules as well as the addition of a new kernel module for the
> provider itself. It would be great if you could release those binaries
> and source so we could start to experiment with them on our own systems.

I've got three requests (so far) for this. I'll be working on it! :-)

> Thanks.

Thank you!

- Sam
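A minimal sketch of watching that fan-out from a client-side truncate(),
using the probe naming conventions from this thread; the nfs4 provider and
these probe names are illustrative, not a final interface:

#!/usr/sbin/dtrace -qs
/*
 * Hypothetical: print each probe as a COMPOUND drives its subordinate
 * operation and attribute probes. Provider and probe names are assumptions.
 */
nfs4::op-compound:start,
nfs4::op-setattr:start,
nfs4::attr-size:start
{
        printf("%d %s:%s\n", timestamp, probefunc, probename);
}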
On Tue, Jan 03, 2006 at 01:17:32PM -0700, Sam Falkner wrote:
>> This might be a dumb question, but your blog seems to focus exclusively
>> on NFSv4. Would supporting a legacy (or future) version of NFS require
>> a new version of the provider? If that's the case, then perhaps the
>> provider name should contain a 4; otherwise, it would be good to provide
>> some documentation on how it might support older versions and how it
>> might be extended to support subsequent versions.
>
> I wasn't going to try to support older versions in the near term, but I
> didn't want to exclude the possibility either. I hadn't really thought
> about whether a different provider would be a good idea, but now I think
> that it might. An NFSv3 DTrace provider's probes would probably fall into
> a different pattern than v4, so it should probably be separated into
> another provider.
>
> For now, I'll add a "4" to the end of the providers' names.

Making it the nfs4 provider seems a little awkward. Would it be worth
doing the investigation to see how it might be extended to NFSv3 before
finalizing that naming change?

>> Regarding arguments, I'm not sure that we really want to expose types
>> such as READ4args which are (I believe) currently just implementation
>> details. Would it make sense to invent a new structure or just enumerate
>> the arguments to the operation in args[1..N]?
>
> One reason I favor a structure is that it looks very difficult to me for
> a DTrace provider to give more than five arguments. So OPEN4args, which
> has six arguments, would be tough to break into args[1..6].

It's a little tricky, but not actually that difficult to have more than
5 arguments. If you need a hand, feel free to ask. First you need to
figure out which interface makes the most sense -- structures or
enumerated argument lists.

> Structures such as READ4args are defined by RFC 3530 (the document that
> describes NFSv4). As implemented by OpenSolaris, there may be some extra
> members in those structures, but someone understanding RFC 3530 could
> just use the "real" members only. The extra members of the structures
> would potentially be useful, but it does raise problems as to their
> stability.
>
> Of course, translators could easily translate them to the "vanilla
> RFC 3530" versions of the structures. Maybe that's the way to go.

I didn't realize those were defined by the standards. In that case I
suggest you create a translator as you suggest. I'd also like to see
prefixes for the structure members unless the standard also defines those
names.

> I've got three requests (so far) for this. I'll be working on it! :-)

Very cool. I look forward to playing with it.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
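A sketch of the kind of translator Adam suggests, mapping the kernel's
READ4args onto a stable, prefixed structure. The output type and its member
names are made up for illustration; only the offset and count fields come
from RFC 3530, and the real kernel layout may differ:

/*
 * Hypothetical translator from the implementation's READ4args to a stable,
 * RFC 3530-shaped structure; every name here is illustrative only.
 */
typedef struct nfs4_readargs {
        uint64_t nra_offset;            /* offset4 offset (RFC 3530) */
        uint32_t nra_count;             /* count4 count (RFC 3530) */
} nfs4_readargs_t;

translator nfs4_readargs_t < READ4args *R > {
        nra_offset = R->offset;
        nra_count  = R->count;
};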
On Mon, Jan 02, 2006 at 06:07:26PM -0800, Adam Leventhal wrote:
> I've always been (and continue to be) a little confused by compound
> operations. My understanding is that a compound operation is a single
> command that can invoke many operations at one go.
                               ^^^^^^^^^

Yes, though the ops are evaluated in sequence, and there's what you might
call variables (two: the current and saved filehandles) which are
referenced and can be changed by various ops. So COMPOUND is like a very
limited programming language.

> I can imagine someone enabling the nfs::op-create:start probe (or
> whatever) and drawing the wrong conclusions because the create
> operations were mostly part of compound operations.

They can only be part of a COMPOUND.

> Assuming I haven't constructed a completely farcical scenario, would it
> make sense to have a compound operation fire the op-compound probe as
> well as any probes for operations collected by that compound operation?

That's what I thought this prototype did. Note that each op has to fire
in the order it's processed.

Nico
--
On Tue, Jan 03, 2006 at 12:23:51PM -0800, Adam Leventhal wrote:
> On Tue, Jan 03, 2006 at 01:17:32PM -0700, Sam Falkner wrote:
>> For now, I'll add a "4" to the end of the providers' names.
>
> Making it the nfs4 provider seems a little awkward. Would it be worth
> doing the investigation to see how it might be extended to NFSv3 before
> finalizing that naming change?

The protocols are quite different though. An NFSv3 server that translates
NFSv3 RPCs to NFSv4 compounds might be feasible, but it's not how Solaris
does it. The reverse is not feasible. So there's not likely to be any
similarity between an NFSv4 and an NFSv2/3 DTrace provider. NFSv3 and
NFSv2, at least, are fairly similar to each other.

Now, the underlying VOPs done by the server to satisfy a client request
could be traced separately. Perhaps we ought to have a VFS provider.
Hmmmm!

How about having the NFS provider have at least one generic probe pair
that fires whenever any RPC is received/replied? Between that and a VFS
provider one could then come up with useful DTrace scripts that work with
all NFS versions.

Nico
--
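A sketch of how such a version-agnostic probe pair might be used. The probe
names (rpc:start/rpc:done) and the args[0] members (noi_xid, noi_procname)
are assumptions; nothing like this exists in the prototype yet:

#!/usr/sbin/dtrace -qs
/*
 * Hypothetical: per-procedure server response times from a generic
 * request/reply probe pair that fires for any NFS version.
 */
nfs::rpc:start
{
        rqtime[args[0]->noi_xid] = timestamp;
}

nfs::rpc:done
/rqtime[args[0]->noi_xid]/
{
        @rtt[args[0]->noi_procname] =
            quantize(timestamp - rqtime[args[0]->noi_xid]);
        rqtime[args[0]->noi_xid] = 0;
}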
G'Day Sam,

On Mon, 2 Jan 2006, Sam Falkner wrote:

> FYI, I posted a blog a few days ago about a DTrace provider for NFS
> that is currently in development:
>
> http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs
>
> Let's discuss any questions, comments, etc. here. I also advertised
> this on nfs-discuss at opensolaris.org. Naturally, I would expect the
> discussion here to be more on the specifics of DTrace, and the
> discussion on nfs-discuss to be more about NFS. Feel free to join
> either or both discussions.

This is good news. :-)

At the moment it's easy to fetch NFS client I/O activity from the io
provider, but I'd like to trace server activity as well. I wrote a script
called nfswizard.d in the DTraceToolkit to do something useful with
io:nfs:: (I should rename it nfsclientwizard.d); its output is,

---
# nfswizard.d
Sampling... Hit Ctrl-C to end.
^C
NFS Client Wizard.
    2005 Dec 2 14:59:07 -> 2005 Dec 2 14:59:14

Read:  4591616 bytes (4 Mb)
Write: 0 bytes (0 Mb)

Read:  640 Kb/sec
Write: 0 Kb/sec

NFS I/O events:    166
Avg response time: 8 ms
Max response time: 14 ms

Response times (us):
          value  ------------- Distribution ------------- count
            128 |                                         0
            256 |                                         1
            512 |@@@                                      14
           1024 |@                                        4
           2048 |@@@@@@@                                  30
           4096 |@@@@@                                    20
           8192 |@@@@@@@@@@@@@@@@@@@@@@@                  97
          16384 |                                         0

Top 25 files accessed (bytes):
   PATHNAME                                       BYTES
   /net/mars/var/tmp/adm/vold.log                 4096
   /net/mars/var/tmp/adm/uptime                   4096
   /net/mars/var/tmp/adm/mail                     4096
   /net/mars/var/tmp/adm/authlog.5                4096
   /net/mars/var/tmp/adm/ftpd                     12288
   /net/mars/var/tmp/adm/spellhist                16384
   /net/mars/var/tmp/adm/messages                 16384
   /net/mars/var/tmp/adm/utmpx                    20480
   /net/mars/var/tmp/adm/ftpd.2                   20480
   /net/mars/var/tmp/adm/ftpd.3                   20480
   /net/mars/var/tmp/adm/ftpd.1                   24576
   /net/mars/var/tmp/adm/ftpd.0                   24576
   /net/mars/var/tmp/adm/lastlog                  28672
   /net/mars/var/tmp/adm/ipf                      61440
   /net/mars/var/tmp/adm/loginlog                 69632
   /net/mars/var/tmp/adm/ipf.4                    73728
   /net/mars/var/tmp/adm/messages.20040906        81920
   /net/mars/var/tmp/adm/ipf.3                    102400
   /net/mars/var/tmp/adm/ipf.1                    110592
   /net/mars/var/tmp/adm/ipf.5                    114688
   /net/mars/var/tmp/adm/ipf.2                    114688
   /net/mars/var/tmp/adm/ipf.0                    122880
   /net/mars/var/tmp/adm/route.log                266240
   /net/mars/var/tmp/adm/pppd.log                 425984
   /net/mars/var/tmp/adm/wtmpx                    2842624
---

You may find the details I chose to examine interesting. Anyway, this
sort of information would be great (and much more useful) from a server
perspective.

cheers,

Brendan

[Sydney, Australia]
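Until the provider lands, a rough server-side approximation is possible with
fbt. A sketch under the assumption that the OpenSolaris NFSv3 server entry
points are named rfs3_read()/rfs3_write(); these are implementation details
that can change between builds, which is exactly the problem a real provider
would solve:

#!/usr/sbin/dtrace -qs
/*
 * Rough server-side NFSv3 read/write request counts via fbt. Function
 * names are implementation details, not a stable interface.
 */
fbt::rfs3_read:entry,
fbt::rfs3_write:entry
{
        @reqs[probefunc] = count();
}

profile:::tick-10sec
{
        printa("%-12s %@d\n", @reqs);
        trunc(@reqs);
}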
A tiny niggle... If you have the option, capture the latency and RTT as
well.

The latency is the measure of how much time passes before the first byte
arrives, and consists of
 - one round-trip-time (RTT),
 - the time the request sits in queue before it's serviced, and
 - the time the program has to spend thinking before it has anything to
   send.

If you have them separate, you can do analysis on the response time
without queuing and without the RTT, and predict
 - the response time under increasing load (and queuing)
 - the response time with a longer or slower network

It also helps when computing throughput in bytes per second (instead of
in transactions per second), as it's usually better to do
bytes/transfer_time than bytes/RT, where transfer time is RT - latency.
The latter number can be compared directly to the network throughput, to
see what percentage we're using up.

I'd just list the averages under the existing ones, and leave the graph
the same:

  Read:  4591616 bytes (4 Mb)
  Write: 0 bytes (0 Mb)

  Read:  640 Kb/sec    -- this might change
  Write: 0 Kb/sec

  NFS I/O events:    16
  Avg response time: 8 ms
  Max response time: 14 ms
  Avg latency:       1 ms
  Avg RTT:           1 ms

--dave (who's writing a paper on this in his Copious Spare Time) c-b

Brendan Gregg wrote:
> G'Day Sam,
>
> On Mon, 2 Jan 2006, Sam Falkner wrote:
>
>> FYI, I posted a blog a few days ago about a DTrace provider for NFS
>> that is currently in development:
>>
>> http://blogs.sun.com/roller/page/samf?entry=a_dtrace_provider_for_nfs
>>
>> Let's discuss any questions, comments, etc. here. I also advertised
>> this on nfs-discuss at opensolaris.org. Naturally, I would expect the
>> discussion here to be more on the specifics of DTrace, and the
>> discussion on nfs-discuss to be more about NFS. Feel free to join
>> either or both discussions.
>
> This is good news. :-)
>
> At the moment it's easy to fetch NFS client I/O activity from the io
> provider, but I'd like to trace server activity as well. I wrote a script
> called nfswizard.d in the DTraceToolkit to do something useful with
> io:nfs:: (I should rename it nfsclientwizard.d); its output is,
>
> ---
> # nfswizard.d
> Sampling... Hit Ctrl-C to end.
> ^C
> NFS Client Wizard.
2005 Dec 2 14:59:07 -> 2005 Dec 2 14:59:14 > > Read: 4591616 bytes (4 Mb) > Write: 0 bytes (0 Mb) > > Read: 640 Kb/sec > Write: 0 Kb/sec > > NFS I/O events: 166 > Avg response time: 8 ms > Max response time: 14 ms > > Response times (us): > value ------------- Distribution ------------- count > 128 | 0 > 256 | 1 > 512 |@@@ 14 > 1024 |@ 4 > 2048 |@@@@@@@ 30 > 4096 |@@@@@ 20 > 8192 |@@@@@@@@@@@@@@@@@@@@@@@ 97 > 16384 | 0 > > Top 25 files accessed (bytes): > PATHNAME BYTES > /net/mars/var/tmp/adm/vold.log 4096 > /net/mars/var/tmp/adm/uptime 4096 > /net/mars/var/tmp/adm/mail 4096 > /net/mars/var/tmp/adm/authlog.5 4096 > /net/mars/var/tmp/adm/ftpd 12288 > /net/mars/var/tmp/adm/spellhist 16384 > /net/mars/var/tmp/adm/messages 16384 > /net/mars/var/tmp/adm/utmpx 20480 > /net/mars/var/tmp/adm/ftpd.2 20480 > /net/mars/var/tmp/adm/ftpd.3 20480 > /net/mars/var/tmp/adm/ftpd.1 24576 > /net/mars/var/tmp/adm/ftpd.0 24576 > /net/mars/var/tmp/adm/lastlog 28672 > /net/mars/var/tmp/adm/ipf 61440 > /net/mars/var/tmp/adm/loginlog 69632 > /net/mars/var/tmp/adm/ipf.4 73728 > /net/mars/var/tmp/adm/messages.20040906 81920 > /net/mars/var/tmp/adm/ipf.3 102400 > /net/mars/var/tmp/adm/ipf.1 110592 > /net/mars/var/tmp/adm/ipf.5 114688 > /net/mars/var/tmp/adm/ipf.2 114688 > /net/mars/var/tmp/adm/ipf.0 122880 > /net/mars/var/tmp/adm/route.log 266240 > /net/mars/var/tmp/adm/pppd.log 425984 > /net/mars/var/tmp/adm/wtmpx 2842624 > --- > > You may find the details I choose to examine interesting. Anyway, this > sort of information would be great (and much more useful) from a server > perspective. > > cheers, > > Brendan > > [Sydney, Australia] > > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org >-- David Collier-Brown, | Always do right. This will gratify Sun Microsystems, Toronto | some people and astonish the rest davecb at canada.sun.com | -- Mark Twain (416) 263-5733 (x65733) |
G'Day Dave,

On Fri, 6 Jan 2006, David Collier-Brown wrote:

> A tiny niggle... If you have the option, capture the latency
> and RTT as well.

Please niggle away. :)

> The latency is the measure of how much time passes before
> the first byte arrives, and consists of
> - one round-trip-time (RTT),
> - the time the request sits in queue before it's serviced, and
> - the time the program has to spend thinking before it
>   has anything to send.

This makes sense to me. So, rather than measuring the specific overheads
when initialising an NFS request (which could get out of hand), we
calculate latency as a useful and close enough estimate.

Starting with the obvious, I could use,

   time(latency) = time(1st nfs_read:return) - time(nfs_open:entry)
   time(RTT)     = time(nfs_read:return) - time(nfs_read:entry)

so that latency includes work performed from the nfs_open to the first
byte received, and RTT is a read time. RTT could be calculated as an
average for all RTTs measured.

But. nfs_read can return straight from the cache, undercounting latency
and RTT. I could look at nfs_bio to guarantee a network event - measuring
nfs_open:entry to the 1st nfs_bio:return. This would be better, but then
could overcount latency if there were several cached reads before the
first nfs_bio. Hmm.

Being a bit creative,

   time(latency) = time(1st nfs_read:entry) - time(nfs_open:entry)
                 + time(1st nfs_bio:return) - time(1st nfs_bio:entry)
   time(RTT)     = time(nfs_bio:return) - time(nfs_bio:entry)

So that latency includes the nfs_open to 1st event overhead, plus the
first RTT. And now to include NFS writes,

   time(latency) = MIN(time(1st nfs_read:entry), time(1st nfs_write:entry))
                 - time(nfs_open:entry)
                 + time(1st nfs_bio:return) - time(1st nfs_bio:entry)

... I could measure RTT closer to the network driver by using io:nfs::done
and io:nfs::start. I should also include other ops apart from read/write.

> If you have them separate, you can do analysis on the
> response time without queuing and without the RTT, and predict
> - the response time under increasing load (and queuing)
> - the response time with a longer or slower network
>
> It also helps when computing throughput in bytes per second
> (instead of in transactions per second), as it's usually better
> to do bytes/transfer_time than bytes/RT, where transfer time
> is RT - latency.

I understand this point - so that we know what the network interface has
been asked to do. No big deal, but if latency includes one RTT, then
wouldn't transfer time be RT - latency + 1 RTT?

> The latter number can be compared directly
> to the network throughput, to see what percentage we're using up.

... To give an estimate for the percentage consumed - this wouldn't take
account of TCP/IP/Ethernet headers, TCP retransmits, etc. Anyway, using
bytes/transfer_time may be fine for understanding maximum throughput and
predicting response times - but I'm not sure a throughput percentage is
that meaningful in terms of overall network utilisation (100% utilised
may be fine, for short bursts).

> I'd just list the averages under the existing ones, and leave the
> graph the same:
>
>   Read:  4591616 bytes (4 Mb)
>   Write: 0 bytes (0 Mb)
>
>   Read:  640 Kb/sec    -- this might change

This is total read Kb per second for that sample. It would certainly
change if I'm reporting maximum Kb/sec on the network interface, or some
other statistic based on the transfer time.
This isn't based on transfer times.

>   Write: 0 Kb/sec
>
>   NFS I/O events:    16
>   Avg response time: 8 ms
>   Max response time: 14 ms
>   Avg latency:       1 ms
>   Avg RTT:           1 ms

Ok, looks good to me.

> --dave (who's writing a paper on this in his Copious Spare Time) c-b

The following is the output of a work in progress tool to experiment with
these statistics. I only have fbt::: to play with - good thing the nfs
code is well written. :)

# ./nfsrtt.d
[...]
opened /net/mars/var/tmp/creatbyproc_example.txt
        latency 9 ms
        RTT 8 ms
opened /net/mars/var/tmp/crypt_3rot13.c
        latency 1 ms
        RTT 1 ms
opened /net/mars/var/tmp/cstyle
        latency 13 ms
        RTT 13 ms
opened /net/mars/var/tmp/cswstat.d
        latency 11 ms
        RTT 11 ms
^C
Latency (ns):
          value  ------------- Distribution ------------- count
         524288 |                                         0
        1048576 |@                                        1
        2097152 |@@                                       2
        4194304 |@@@@@@@@@                                10
        8388608 |@@@@@@@@@@@@@@@@@@                       19
       16777216 |@@@@@@@@@                                10
       33554432 |@                                        1
       67108864 |                                         0

RTT (ns):
          value  ------------- Distribution ------------- count
         262144 |                                         0
         524288 |@@@                                      28
        1048576 |@@@@@@@@@@                               81
        2097152 |@@@@@@@@@                                74
        4194304 |@@@@@@@@@@@@                             96
        8388608 |@@@                                      23
       16777216 |@@@                                      23
       33554432 |                                         2
       67108864 |                                         0

The script is attached, which is for NFSv2,3,4. Don't take it too
seriously yet - I only just started work on this a couple of hours ago.
It's a stateful monster with many tentacles. And remember I'm still just
looking at client activity from the client itself.

Next step is to see where else Latency and RTT are useful.

   by process/file/filesystem?
   predicted max transactions = 1 sec / latency?
   predicted max throughput = packet size * (1 sec / RTT)?
   ... I should also print more statistics on packet size. ...

thanks for your email,

Brendan

[Sydney, Australia]

-------------- next part --------------
#!/usr/sbin/dtrace -s
/*
 * nfsrtt.d - NFS RTT and Latency statistics. Work in progress.
 *
 * I've keyed on the vnode addr, and assumed that transactions are processed
 * in the same thread. I need to check whether both of these were wise -
 * I'd guess that this needs to be reworked to work correctly on a multi-CPU
 * server.
 */

#pragma D option quiet

inline int DEBUG = 1;

dtrace:::BEGIN
{
	trace("Sampling...\n");
}

fbt:nfs:nfs_open:entry,
fbt:nfs:nfs3_open:entry,
fbt:nfs:nfs4_open:entry
{
	self->opened[(uint64_t)*args[0]] = timestamp;
	self->firstread[(uint64_t)*args[0]] = 1;
	DEBUG ? printf("opened %s\n",
	    stringof(((struct vnode *)*args[0])->v_path)) : 1;
}

fbt:nfs:nfs_read:entry,
fbt:nfs:nfs_write:entry,
fbt:nfs:nfs3_read:entry,
fbt:nfs:nfs3_write:entry,
fbt:nfs:nfs4_read:entry,
fbt:nfs:nfs4_write:entry
/self->firstread[arg0] && self->opened[arg0]/
{
	self->initial[arg0] = timestamp - self->opened[arg0];
	self->firstread[arg0] = 0;
}

fbt:nfs:nfs_bio:entry,
fbt:nfs:nfs3_bio:entry,
fbt:nfs:nfs4_bio:entry
/self->opened[(uint64_t)args[0]->b_vp]/
{
	self->rttstart[(uint64_t)args[0]->b_vp] = timestamp;
	self->vn = (uint64_t)args[0]->b_vp;
}

fbt:nfs:nfs_bio:return,
fbt:nfs:nfs3_bio:return,
fbt:nfs:nfs4_bio:return
/self->initial[self->vn]/
{
	this->latency = self->initial[self->vn] +
	    timestamp - self->rttstart[self->vn];
	@Latency = quantize(this->latency);
	self->initial[self->vn] = 0;
	DEBUG ? printf("\tlatency %d ms\n", this->latency / 1000000) : 1;
}

fbt:nfs:nfs_bio:return,
fbt:nfs:nfs3_bio:return,
fbt:nfs:nfs4_bio:return
/self->rttstart[self->vn]/
{
	this->rtt = timestamp - self->rttstart[self->vn];
	@RTT = quantize(this->rtt);
	self->rttstart[self->vn] = 0;
	self->vn = 0;
	DEBUG ? printf("\tRTT %d ms\n", this->rtt / 1000000) : 1;
}

fbt:nfs:nfs_close:entry,
fbt:nfs:nfs3_close:entry,
fbt:nfs:nfs4_close:entry
{
	self->opened[arg0] = 0;
}

dtrace:::END
{
	printf("Latency (ns):");
	printa(@Latency);
	printf("RTT (ns):");
	printa(@RTT);
}
Sam:

I have been practicing DTrace; here is one of a few things I want to share
with you. Pointers are very much appreciated!

After working on different platforms with different kernel modules,
sometimes even with grid systems and resource virtualization (processors,
zones, file systems, etc.), I have realized that the stability of my D code
in terms of Solaris engineering, and its portability and reusability in
terms of software engineering, is where I need to spend my time:

(1) different versions of kernel modules, such as file systems, and
    different versions of in-kernel function calls

(2) different platform architecture designs and implementations, such as
    CPU kernel structures. I felt lucky to deal with VM and VFS since they
    do the abstraction for me; however, CPU is one of the exceptions

(3) normal host-based instrumentation vs. virtualization environments,
    such as zone-specific instrumentation

This may not be only for NFS, but NFS is one of the use cases I need to
continue to put my thoughts on. However, it may not always require
introducing more kernel abstraction and encapsulation; some work can still
be done at the D code level so that my D code is more extendable to new
platforms, kernel modules and kernel functions. As for the different
versions of kernel modules and functions, they are really the challenge
for my D code, since it means I need to keep my D code in sync with the
life cycle of your module and function releases. Pointers are very much
appreciated!

Thanks

This message posted from opensolaris.org
I'm not sure I understand everything that you're saying, but I will try to
summarize the thoughts that come to mind when reading your email.

A DTrace provider for NFS doesn't really give you much that you couldn't
do with the standard fbt provider. But what it does give you is a layer of
abstraction and stability over using fbt. When I first started thinking
about a provider, it was because I was a bit frustrated when trying to
debug a somewhat complex problem. I was very tired, and didn't feel like
discovering all of the paths through the functions, and the arguments to
the functions, that could lead to the event I was trying to trace.

A DTrace provider gives you much in the way of independence from the
current implementation of the OpenSolaris NFS code. You can write a script
to the NFSv4 protocol, and not care about which functions are being called
to implement the protocol. If there are massive changes in the NFSv4
client code between Solaris 10 and the next release, a script written
exclusively to the NFS provider won't be broken. Thus, your script will be
more stable.

I hope that the NFS providers and other providers help to give your
scripts the stability that you need.

- Sam

On Jan 9, 2006, at 7:45 AM, ttoulliu2002 wrote:

> Sam:
>
> I have been practicing DTrace; here is one of a few things I want to
> share with you. Pointers are very much appreciated!
>
> After working on different platforms with different kernel modules,
> sometimes even with grid systems and resource virtualization
> (processors, zones, file systems, etc.), I have realized that the
> stability of my D code in terms of Solaris engineering, and its
> portability and reusability in terms of software engineering, is where
> I need to spend my time:
>
> (1) different versions of kernel modules, such as file systems, and
>     different versions of in-kernel function calls
>
> (2) different platform architecture designs and implementations, such as
>     CPU kernel structures. I felt lucky to deal with VM and VFS since
>     they do the abstraction for me; however, CPU is one of the exceptions
>
> (3) normal host-based instrumentation vs. virtualization environments,
>     such as zone-specific instrumentation
>
> This may not be only for NFS, but NFS is one of the use cases I need to
> continue to put my thoughts on. However, it may not always require
> introducing more kernel abstraction and encapsulation; some work can
> still be done at the D code level so that my D code is more extendable
> to new platforms, kernel modules and kernel functions. As for the
> different versions of kernel modules and functions, they are really the
> challenge for my D code, since it means I need to keep my D code in sync
> with the life cycle of your module and function releases. Pointers are
> very much appreciated!
>
> Thanks
> This message posted from opensolaris.org
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
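To make the stability point concrete, a hedged before/after sketch: the
first clause is tied to today's function names via fbt, the second to the
protocol-level probe the provider would expose (the nfs4 provider and
op-read:start probe name follow this thread's naming and are illustrative
only):

/* Implementation-bound: breaks if nfs4_read() is renamed or restructured. */
fbt:nfs:nfs4_read:entry
{
        @fbt_reads = count();
}

/*
 * Protocol-bound (hypothetical nfs4 provider): tracks the NFSv4 READ
 * operation itself, not a particular C function.
 */
nfs4::op-read:start
{
        @provider_reads = count();
}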
debabrata das
2009-Dec-03 18:24 UTC
[dtrace-discuss] Trapping nfs client calls against a particular mounted filesystem
Hi Sam,

As part of an I/O tuning exercise, I am trying to capture the different NFS
client calls against one of our many mounted file systems. We are using NFS
version 3. Could you please let me know a script for doing this? I am not an
expert in dtrace. I would like to know the time spent in seconds and also
the number of calls.

Thanks
Deba
--
This message posted from opensolaris.org
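A rough starting point for this kind of question, using fbt on the client's
NFSv3 vnode operations (nfs3_read()/nfs3_write() appear earlier in this
thread; nfs3_getattr() and nfs3_lookup() are assumed to exist alongside
them, and all are unstable implementation details). Restricting the count
to a single mounted filesystem would additionally need a predicate on the
vnode's vfs pointer, which is not shown here:

#!/usr/sbin/dtrace -qs
/*
 * Rough client-side NFSv3 call counts and total elapsed time per operation.
 * Function names are implementation details and may differ by release.
 */
fbt:nfs:nfs3_read:entry,
fbt:nfs:nfs3_write:entry,
fbt:nfs:nfs3_getattr:entry,
fbt:nfs:nfs3_lookup:entry
{
        self->ts = timestamp;
}

fbt:nfs:nfs3_read:return,
fbt:nfs:nfs3_write:return,
fbt:nfs:nfs3_getattr:return,
fbt:nfs:nfs3_lookup:return
/self->ts/
{
        @calls[probefunc] = count();
        @nsecs[probefunc] = sum(timestamp - self->ts);
        self->ts = 0;
}

dtrace:::END
{
        printf("%-16s %10s %14s\n", "OP", "CALLS", "TOTAL(ns)");
        printa("%-16s %@10d %@14d\n", @calls, @nsecs);
}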