thr3ads.net - crossbow discuss - [crossbow-discuss] inception review summary of PSARC/2009/364

Shrikrishna Khare

2009-Jun-24 22:15 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Moving this discussion to crossbow-discuss:

Alternatively, we can provide an interface to let the user 'mark' a
particular instant in time. Later, one can retrieve the statistics counters
since the last 'marking'. (This borrows from Nicolas's suggestion during the
PSARC meeting today.) To elaborate:

- At time instant A, save the current counter values ('-n' for "take
  snapshot now"); this command returns nothing: dlstat -n
- At a later time B, display cumulative receive-side statistics: dlstat -r
- At a later time C, display statistics during time (A, B) ('-d' for delta):
  dlstat -r -d
- A future "dlstat -n" will overwrite the previous snapshot counters with
  the then-current values.

In other words, a user interested in resetting statistics would simply
need to run dlstat -n once and use dlstat commands with an additional -d
from then on.

Since the kernel counters themselves will never be reset, kstat will 
continue to return cumulative statistics.

~ Shri


Kais Belgaied wrote:
> Thanks Jim (and good to hear from you). I captured this issue as
> jdc-01 in the issues file,
> and Shri indicated off-line that he will answer and follow up on
> crossbow-discuss at opensolaris.org.
>
>    Kais.
>
> On 06/24/09 12:09, James Carlson wrote:
>> Kais Belgaied writes:
>>
>>> - Multiple issues with reset stats:
>>>   - The loss of information accumulated since boot time may hurt the
>>>     diagnosability of problem with flows and links.
>>>   - The action needs to be privileged.
>>>   - suggestion: move the reset subcommand to {dl,flow}adm, with expected
>>>     effect of resetting the state of the datalink/flow, which includes the
>>>     usage statistics thereof.
>>>
>>
>> I suggest getting rid of "reset" altogether.  Besides being
>> fundamentally incompatible with SNMP instrumentation, "reset" just
>> isn't necessary or complete as long as you have decent delta-calculating
>> tools.  (And I'd really rather have a good way of
>> computing deltas -- especially as a non-privileged user -- than having
>> an inaccessible way to nuke kernel counters.)
>>
>
> _______________________________________________
> opensolaris-arc mailing list
> opensolaris-arc at opensolaris.org

Garrett D'Amore

2009-Jun-24 23:01 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

In my opinion, this violates KISS.  We don't go through these gyrations
for any other statistic reporting facility, and I don't see why it is
warranted here.

An end user can simply do two dlstat calls and perform subtraction.  It
only takes a few lines of shell code, right?

count1=`dlstat -o value <options>`
generate_some_long_running_load
count2=`dlstat -o value <options>`
echo "Total value is " `expr $count2 - $count1`

The reality is that most users of the facility won't even go to that
trouble.

If the user is savvy enough to store a snapshot name somewhere, he can
probably also store a numeric value.  (Perhaps by saving the dlstat 
output to a scratch file.)
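
As a minimal sketch of the scratch-file approach described above: the helper names save_count and delta_since are hypothetical (they are not part of dlstat), and the counter reading is assumed to be a plain integer obtained with whatever command is available.

```shell
#!/bin/sh
# Sketch only: keep one counter reading in a scratch file and report
# how far a later reading has moved from it.
# save_count and delta_since are hypothetical helper names.

# save_count VALUE FILE: record a counter reading in FILE.
save_count() {
    printf '%s\n' "$1" > "$2"
}

# delta_since VALUE FILE: print VALUE minus the reading saved in FILE.
delta_since() {
    saved=$(cat "$2") || return 1
    echo $(( $1 - $saved ))
}
```

A caller would pass the reading taken before the load to save_count, and the reading taken afterwards to delta_since with the same file.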

Let's not overengineer things here.  (Yeah, I know Sun engineers like to
overengineer things... but I still try to buck the trend.)

    - Garrett


James Carlson

2009-Jun-24 23:07 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Shrikrishna Khare writes:
> In other words, user interested in resetting statistics would simply
> need to run dlstat -n once and use dlstat commands with additional -d
> there onwards.

Getting better ... but where does "-n" store its information?  Is
there only one global repository of snapshotted data?

What I'd really like to have (as a non-privileged user) is a way to
snapshot the counters at two points in time, and then compute the
differences.  If the dlstat command helped me do that, then that'd be
great.  One possibility would be:

	dlstat snap mysnapshot1
	...
	dlstat snap mysnapshot2
	dlstat delta mysnapshot1 mysnapshot2

... where the "snap" subcommand takes a snapshot and saves it in the
indicated file (in an arbitrary format; binary would be fine).  The
"delta" command would then display the statistics in some
human-friendly form.
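
The snap/delta idea above can be approximated outside dlstat. The following is a sketch only, assuming each snapshot file holds one counter per line as a name followed by an integer value. That file format and the snap_delta helper name are assumptions made for illustration, not anything dlstat defines.

```shell
#!/bin/sh
# Sketch only: compute per-counter differences between two snapshot files.
# Assumed (hypothetical) file format: "name<whitespace>value", one per line.

# snap_delta OLD NEW: for every counter present in both files,
# print "name<TAB>(NEW value - OLD value)" in the order of NEW.
snap_delta() {
    awk '
        NR == FNR { old[$1] = $2; next }   # first file: remember the baseline
        $1 in old { printf "%s\t%.0f\n", $1, $2 - old[$1] }
    ' "$1" "$2"
}
```

Because the snapshots are ordinary files owned by the caller, nothing here needs privilege, and any number of snapshots can coexist. Note that the NR == FNR idiom assumes the first file is not empty.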

> Since the kernel counters themselves will never be reset, kstat will
> continue to return cumulative statistics.

If it's saved between commands using a global snapshot, then it's
almost the same thing as clearing the kernel statistics, at least as
far as privilege goes.  (Though it does fix the problem with SNMP.)

-- 
James Carlson         42.703N 71.076W         <carlsonj at workingcode.com>

James Carlson

2009-Jun-24 23:24 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore writes:
> If the user is savvy enough to store a snapshot name somewhere, he can
> probably also store a numeric value.  (Perhaps by saving the dlstat
> output to a scratch file.)

I'd be happy with that as well.  I'm only unhappy with the idea of
clearing statistics.  We used to have that in netstat, and we got rid
of it on purpose.

-- 
James Carlson         42.703N 71.076W         <carlsonj at workingcode.com>

Garrett D'Amore

2009-Jun-25 01:21 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

James Carlson wrote:
> Garrett D'Amore writes:
>
>> If the user is savvy enough to store a snapshot name somewhere, he can
>> probably also store a numeric value.  (Perhaps by saving the dlstat
>> output to a scratch file.)
>>
>
> I'd be happy with that as well.  I'm only unhappy with the idea of
> clearing statistics.  We used to have that in netstat, and we got rid
> of it on purpose.
>

Yes, I agree that clearing kstats is a bad idea.  I really believe we
should go for the simplest solution here.  One doesn't need to design a
suspension truss capable of holding five tons when all one needs is a
kitchen table.

    - Garrett

Darren Reed

2009-Jun-25 17:20 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

James Carlson wrote:
> Garrett D'Amore writes:
>> If the user is savvy enough to store a snapshot name somewhere, he can
>> probably also store a numeric value.  (Perhaps by saving the dlstat
>> output to a scratch file.)
>
> I'd be happy with that as well.  I'm only unhappy with the idea of
> clearing statistics.  We used to have that in netstat, and we got rid
> of it on purpose.

To wander off on a related tangent...

Sometimes clearing statistics can be useful...

When I've had an Internet connection via an ISP that
meters traffic (you have a 'quota' of x gb/month),
I've run a cron job on the end-of-month day to dump
out the output from ipfstat and clear the counters
back to 0. Thus at any time during the month, using
ipfstat gives me a reasonable approximation of the
current quota use.

Not being able to reset the number back to 0 would
require storing the "snapshot" data somewhere and thus
create a much more complex task of reviewing Internet
use. There is a slight problem there of reboot
interruptions, but they don't happen all that often ;)

The act of resetting stats back to 0 also provides the
user with the current value of said statistics.
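
For comparison, a rough sketch of the same quota bookkeeping without clearing any counters, assuming the counter value at the start of the month is saved as a baseline. The usage_since helper is hypothetical, and treating a counter that has gone backwards as a restart from zero is an assumption about reboot behaviour.

```shell
#!/bin/sh
# Sketch only: report usage relative to a saved start-of-month baseline
# instead of resetting the counter. usage_since is a hypothetical helper.

# usage_since BASELINE CURRENT: print the traffic counted since BASELINE.
# If CURRENT is below BASELINE, the counter is assumed to have restarted
# from zero (for example after a reboot), so CURRENT itself is reported.
usage_since() {
    if [ "$2" -lt "$1" ]; then
        echo "$2"
    else
        echo $(( $2 - $1 ))
    fi
}
```

As with the reset-based cron job, traffic counted before a reboot is lost unless it is recorded separately, so the result is still an approximation.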

Darren

Garrett D'Amore

2009-Jun-25 22:56 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

This is not a bad approach.  I'm still not entirely convinced it's worth
the effort, but I can still see it being useful.

    -- Garrett


Shrikrishna Khare wrote:
>
> Here is an alternative (from the discussion Nicolas and I had a while
> ago).
>
> 1. dlstat -k -p > snap1.txt               # Dump all the statistics in
> parsable format, output redirected to snap1.txt. This option is part
> of the originally proposed man page.
> 2. dlstat -r -h                           # Display receive side per
> hardware lane statistics.
> 3. dlstat -r -h -d snap1.txt              # Diff "current receive side
> per hardware lane stats" with the stats saved in snap1.txt and display.
> 4. dlstat -k -p > snap2.txt               # Dump all the statistics in 
> parsable format, output redirected to snap2.txt.
> 5. dlstat -t -h -d snap1.txt -D snap2.txt # Diff "transmit side per 
> hardware lane stats" saved in snap2.txt with the stats saved in 
> snap1.txt and display.
>
> This approach not only avoids resetting statistics but also provides 
> an easy to use interface for diff calculation.
>
> ~ Shri

Shrikrishna Khare

2009-Jun-25 22:57 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Here is an alternative (from the discussion Nicolas and I had a while ago).

1. dlstat -k -p > snap1.txt               # Dump all the statistics in 
parsable format, output redirected to snap1.txt. This option is part of 
the originally proposed man page.
2. dlstat -r -h                           # Display receive side per 
hardware lane statistics.
3. dlstat -r -h -d snap1.txt              # Diff "current receive side 
per hardware lane stats" with the stats saved in snap1.txt and display.
4. dlstat -k -p > snap2.txt               # Dump all the statistics in 
parsable format, output redirected to snap2.txt.
5. dlstat -t -h -d snap1.txt -D snap2.txt # Diff "transmit side per 
hardware lane stats" saved in snap2.txt with the stats saved in 
snap1.txt and display.

This approach not only avoids resetting statistics but also provides an 
easy to use interface for diff calculation.

~ Shri


Shrikrishna Khare

2009-Jun-26 01:18 UTC

[crossbow-discuss] dlstat code changes: early draft

Hi,

    Here is an early webrev with dlstat changes. Please note that I am
*not* requesting code review. This is still work in progress with
several known (and unknown!) issues.

Webrev:
    http://cr.opensolaris.org/~shri.k/dlstat_initial_webrev

PSARC case:
    http://arc.opensolaris.org/caselog/PSARC/2009/364/

In this webrev, dlstat queries the kernel for the statistics with an
explicit ioctl. That will be changed to a kstat query instead, with kstat
counters introduced for all the statistics in question. Moreover, some
of the command syntax will change. Refer to "issues" in the PSARC case
noted above for more details.

Some of the sample outputs generated on Intel 10GbE setup with above 
changes:

# Display statistics
# dlstat show ixgbe0
LINK           IPKTS   RBYTES      OPKTS   OBYTES  UTIL
ixgbe0         39.52G  4.19T       702.00K 121.05M 0

# Receive side statistics
# dlstat show ixgbe0 -r
LINK       IPKTS   RBYTES   INTRS    POLLS   HDRPS   SDRPS  CH<10    CH10-50  CH>50    UTIL
ixgbe0     39.52G  4.19T    5.22G    34.30G  17.29G  0.00   1.41G    178.41M  155.05M  0

# Transmit side statistics
# dlstat show ixgbe0 -t
LINK           OPKTS   OBYTES  SDRPS   UTIL
ixgbe0         702.00K 121.05M 0.00    0

# Receive side per lane statistics for ixgbe0 (Only 6 Rx rings are active on the system)
# dlstat show ixgbe0 -r -h
LINK       IPKTS   RBYTES   INTRS    POLLS   HDRPS   SDRPS  CH<10    CH10-50  CH>50    UTIL
ixgbe0:0   6.82G   722.60G  862.72M  5.96G   17.29G  0.00   207.69M  32.17M   26.66M   0
ixgbe0:1   5.24G   554.98G  774.95M  4.46G   0.00    0.00   221.86M  26.62M   20.66M   0
ixgbe0:2   7.42G   786.73G  880.91M  6.54G   0.00    0.00   239.35M  30.39M   28.88M   0
ixgbe0:3   5.67G   601.03G  770.46M  4.90G   0.00    0.00   192.87M  27.24M   23.47M   0
ixgbe0:4   7.47G   792.31G  1.07G    6.41G   0.00    0.00   324.13M  32.64M   27.85M   0
ixgbe0:5   6.90G   731.19G  866.66M  6.03G   0.00    0.00   224.19M  29.35M   27.52M   0


~ Shri

Darren Reed

2009-Jun-26 01:34 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Shrikrishna Khare wrote:
>
> Here is an alternative (from the discussion Nicolas and I had a while
> ago).
>
> 1. dlstat -k -p > snap1.txt               # Dump all the statistics in
> parsable format, output redirected to snap1.txt. This option is part
> of the originally proposed man page.
> 2. dlstat -r -h                           # Display receive side per
> hardware lane statistics.
> 3. dlstat -r -h -d snap1.txt              # Diff "current receive side
> per hardware lane stats" with the stats saved in snap1.txt and display.
> 4. dlstat -k -p > snap2.txt               # Dump all the statistics in
> parsable format, output redirected to snap2.txt.
> 5. dlstat -t -h -d snap1.txt -D snap2.txt # Diff "transmit side per
> hardware lane stats" saved in snap2.txt with the stats saved in
> snap1.txt and display.
>
> This approach not only avoids resetting statistics but also provides
> an easy to use interface for diff calculation.

Why are you building "diff" capability into dlstat?
Why aren't the various incarnations of diff suitable for use here?
Isn't it better to provide output that is friendly to diff(1) use?
My assertion is that if the output is friendly to diff(1), then you don't
need to build in diff capability and that the output will also then be
friendly to many other commands.

Darren

Shrikrishna Khare

2009-Jun-26 03:03 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

----- Original Message -----
From: Darren Reed <Darren.Reed at Sun.COM>
Date: Thursday, June 25, 2009 6:35 pm
Subject: Re: [crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat
To: Shrikrishna Khare <Shrikrishna.Khare at Sun.COM>
Cc: James Carlson <carlsonj at workingcode.com>, gdamore at opensolaris.org, crossbow-discuss at opensolaris.org

> Shrikrishna Khare wrote:
> >
> > Here is an alternative (from the discussion Nicolas and I had a while
> > ago).
> >
> > 1. dlstat -k -p > snap1.txt               # Dump all the statistics
> > in parsable format, output redirected to snap1.txt. This option is
> > part of the originally proposed man page.
> > 2. dlstat -r -h                           # Display receive side per
> > hardware lane statistics.
> > 3. dlstat -r -h -d snap1.txt              # Diff "current receive
> > side per hardware lane stats" with the stats saved in snap1.txt and
> > display.
> > 4. dlstat -k -p > snap2.txt               # Dump all the statistics
> > in parsable format, output redirected to snap2.txt.
> > 5. dlstat -t -h -d snap1.txt -D snap2.txt # Diff "transmit side per
> > hardware lane stats" saved in snap2.txt with the stats saved in
> > snap1.txt and display.
> >
> > This approach not only avoids resetting statistics but also provides
> > an easy to use interface for diff calculation.
>
> Why are you building "diff" capability into dlstat?
> Why aren't the various incarnations of diff suitable for use here?
> Isn't it better to provide output that is friendly to diff(1) use?
> My assertion is that if the output is friendly to diff(1), then you don't
> need to build in diff capability and that the output will also then be
> friendly to many other commands.

- I did not find a diff(1) option that can compute an arithmetic difference.

- Moreover, even if there is some diff variant that can do it, consider the
  use case in the example above:
  Diff "current receive side per hardware lane stats" with the stats
  saved in snap1.txt and display.

  It is still cumbersome to do an additional
        dlstat -k -p > snap3.txt
        <arithmetic diff> snap1.txt snap3.txt
  and then hunt through the machine-parsable format to look for the
  specific fields of interest (per hardware lane stats in our example).

  Having a complete snapshot at a particular instant (dlstat -k -p > <file>)
  and then having the ability to request a diff against the statistics of
  interest (e.g. dlstat -r -h -d <file>) looks appealing to me in comparison.

  Moreover, you can then do fancier stuff like:
              dlstat -r -h -d <file> -i 2

  which will report diffs against the file every 2 seconds.

  Essentially, the proposed approach is aimed at catering to all
  'reset' stats requirements, without really resetting anything.
         

> 
> Darren
>

Darren Reed

2009-Jun-26 16:11 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Shrikrishna Khare wrote:

...
> - I did not find diff(1) option that can compute arithmetic difference.
>
> - Moreover, even if there is some diff variant that can do it, consider the
>   use case as in example above:
>   Diff "current receive side per hardware lane stats" with the
>   stats saved in snap1.txt and display.
>
>   It is still cumbersome to do an additional
>         dlstat -k -p > snap3.txt
>         <arithmetic diff> snap1.txt snap3.txt
>         and then hunt through the machine parsable format to look for
>         specific fields of your interest - per hardware lane stats in our example.
>   
Hunting for specific fields is no harder or easier with
what you are proposing, unless your "dlstat -r -h -d <file>"
somehow reads my mind to know which fields I want.

"Hardware lane" sounds more like a group of statistics,
to me, not specific fields.

But if you gave me a set of output from "dlstat -k -p" and
"dlstat -r -h -p" (assuming -k to be a superset of -h) and
then asked me to find the lines from -k-p that changed in
-r-h-p, I'd probably apply commands like comm, join, diff,
awk, etc. The architecture of Un*x is to provide lots of
small utilities that have a specific purpose that all output
data to stdout/stderr (and take input from files/stdin),
data that can then be used in various ways by various
programs to achieve more.

>   Having a complete snapshot at particular instant (dlstat -k -p > <file>)
>   and then having ability to request diff against statistics of our
>   interest (e.g. dlstat -r -h -d <file>) looks appealing to me in
>   comparison.
>
>   Moreover, you can then do fancier stuff like:
>               dlstat -r -h -d <file> -i 2
>
>   which will report diffs with the file after every 2 seconds.

When using commands such as "vmstat" and "netstat" with an
option like "-i 2", the general goal is to understand the
change in activity for that period of time.

Then again, not all of the statistics displayed with "vmstat 1"
are progressive changes - memory (swap/free) are an example -
but this makes sense: as swap/free drop, you can see the effect
of this on other stats, such as "sr", and friends.

To what purpose do you see printing out the difference in the
value of a statistic, every second, from a known starting point?

I can see the use in "dlstat -r -h -i 2", but to display relative
to a snapshot? why?
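
To make the distinction above concrete, here is a small sketch contrasting the two kinds of delta, assuming cumulative counter samples arrive one per line on standard input. Both helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: two ways to turn cumulative samples into deltas.
# Input for both helpers: one cumulative counter sample per line on stdin.

# interval_deltas: print the change between each pair of consecutive
# samples (the per-interval activity an interval option usually shows).
interval_deltas() {
    awk 'NR > 1 { printf "%.0f\n", $1 - prev } { prev = $1 }'
}

# deltas_from_first: print each later sample's change relative to the
# first sample (the "difference from a known starting point" view).
deltas_from_first() {
    awk 'NR == 1 { base = $1; next } { printf "%.0f\n", $1 - base }'
}
```

For samples 100, 130, 190 the first helper prints 30 and 60, while the second prints 30 and 90, a running total since the first sample.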

>   Essentially, the proposed approach is aimed at catering to all
>   'reset' stats requirement, without really resetting anything.

In which case it would be much better to have a completely new
command written that computed arithmetic differences, based on
a particular type of output.

Then that could be used on the output of flowstat or dlstat or
any other command rather than use an internal feature of one
specific command.

It also reduces the complexity of dlstat whilst providing us
with more chances to reuse the arithmetic difference without
needing to build it into every statistical command.

Darren

James Carlson

2009-Jun-26 16:21 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Darren Reed writes:
> When I've had an Internet connection via an ISP that
> meters traffic (you have a 'quota' of x gb/month),
> I've run a cron job on the end-of-month day to dump
> out the output from ipfstat and clear the counters
> back to 0. Thus at any time during the month, using
> ipfstat gives me a reasonable approximation of the
> current quota use.

I think it'd be nice if that cron job didn't have to run with
privileges sufficient to write to the kernel.

I agree that the delta tool doesn't necessarily have to be part of
dlstat itself.  Perhaps if it were made a common library (along with
the rest of the column-formatting stuff that Brussels cleaned up),
that would address most of the concerns.

-- 
James Carlson         42.703N 71.076W         <carlsonj at workingcode.com>

Darren Reed

2009-Jun-26 18:18 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

James Carlson wrote:
> Darren Reed writes:
>> When I've had an Internet connection via an ISP that
>> meters traffic (you have a 'quota' of x gb/month),
>> I've run a cron job on the end-of-month day to dump
>> out the output from ipfstat and clear the counters
>> back to 0. Thus at any time during the month, using
>> ipfstat gives me a reasonable approximation of the
>> current quota use.
>
> I think it'd be nice if that cron job didn't have to run with
> privileges sufficient to write to the kernel.
>
> I agree that the delta tool doesn't necessarily have to be part of
> dlstat itself.  Perhaps if it were made a common library (along with
> the rest of the column-formatting stuff that Brussels cleaned up),
> that would address most of the concerns.

A C library?

To me that seems like making a mountain out of a mole hill...

15 minutes and I came up with a perl script to print out what
has changed between two dumps of netstat output - and that's
just a very basic script.

How long would it take to write in C? Substantially longer
because you've got to go through all sorts of hoops to handle
text input safely. For those that don't like perl, well, use
python or your other favourite scripting language.

Darren

./statdiff netstat st.1 st.2
tcpInAckBytes   308
ipInReceives    6
ipOutRequests   5
tcpOutDataSegs  5
tcpOutDataBytes 308
tcpOutSegs      5
tcpRttUpdate    4
tcpInInorderSegs        4
ipInDelivers    6
tcpInAckSegs    4
tcpInSegs       6
tcpInInorderBytes       208


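#!/bin/perl
# statdiff: print the counters that changed between two saved dumps.
use strict;
use warnings;

if (@ARGV != 3) {
        print STDERR "Usage: statdiff <format> <file1> <file2>\n";
        exit(1);
}

my ($format, $file1, $file2) = @ARGV;
my (%fdata1, %fdata2);

# String comparison needs "eq"; "==" compares numerically and would
# treat any non-numeric format name as equal to "netstat".
if ($format eq "netstat") {
        netstatfile($file1, \%fdata1);
        netstatfile($file2, \%fdata2);
} else {
        print STDERR "Unknown format: $format\n";
        exit(1);
}

# Report only the counters present in both dumps whose values differ.
foreach my $k (keys %fdata1) {
        next unless exists $fdata2{$k};
        if ($fdata1{$k} != $fdata2{$k}) {
                print "$k\t" . ($fdata2{$k} - $fdata1{$k}) . "\n";
        }
}

exit(0);

# Parse "netstat -s" style output, where a line holds "name = value"
# pairs, optionally preceded by a protocol label.
sub netstatfile {
        my ($path, $data) = @_;
        open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
        while (my $line = <$fh>) {
                chomp $line;
                $line =~ s/^\S+//;      # drop a leading protocol label
                next unless $line =~ /=/;
                $line =~ s/\s+//g;      # squeeze out all whitespace
                # Collect consecutive name=value pairs from the start.
                while ($line =~ /\G([a-z]+)=(\d+)/gi) {
                        $data->{$1} = $2;
                }
        }
        close($fh);
}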

Sunay Tripathi

2009-Jun-26 20:04 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Darren,

I appreciate the fact that you wrote a script in 15 min. Do
you expect our end users each to write a script or a program
and be well versed in shell/awk to get the basic information
they want, i.e. cumulative stats and the rate of change? I
think we really need to make it easy for people to get that
info.

I like Jim's suggestion of putting the functionality in a
library so other sub commands can use it as well. It
doesn't need to be a standalone library if all we have is
a few things. But keep these separate (at least in source
files) so later if they need to be pulled into one library,
it will be easy.

Cheers,
Sunay



-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow
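
For readers who prefer Python, which Darren names as an alternative, the quoted statdiff idea can be sketched roughly as follows. This is a minimal sketch, not Darren's script: it assumes the input is netstat -s style text made of "name = value" pairs, and the names parse_counters and stat_diff are hypothetical, chosen only for illustration.

```python
import re

# Matches "name = value" pairs such as "tcpInSegs = 6".
# Assumption: counter names are alphanumeric and values are unsigned integers.
PAIR = re.compile(r"([A-Za-z][A-Za-z0-9]*)\s*=\s*(\d+)")


def parse_counters(text):
    """Return {counter_name: int_value} for every name=value pair in text."""
    return {name: int(value) for name, value in PAIR.findall(text)}


def stat_diff(before, after):
    """Return {name: delta} for counters present in both dumps that changed."""
    old = parse_counters(before)
    new = parse_counters(after)
    return {k: new[k] - old[k] for k in old if k in new and new[k] != old[k]}
```

Calling stat_diff on the contents of two saved dumps yields only the counters that moved, which is the same information the statdiff output above shows.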

Garrett D'Amore

2009-Jun-26 22:29 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Architecturally, parsing the output from netstat is the wrong answer.

But I'm not convinced that we need to go and provide some rich facility
here.  Again, the tendency from certain engineering groups is to 
over-design things.  Please don't do that.

A good example of a simple facility is to just offer a very simple 
delimited (or single value) output.

Here's a good example:

kstat -p rge:0:mac:opackets

Now I can trivially take two values and diff them together, using either 
a calculator, spreadsheet, or a couple lines of shell:

kstat -p rge:0::opackets | cut -f2 > /tmp/snapshot

do some work for a bit...

expr `kstat -p rge:0::opackets | cut -f2` - `cat /tmp/snapshot`


That took only two lines.  And frankly, except for the desire to 
demonstrate it, I'd probably have just used a calculator or expr on the
command line instead of trying to automate it with parseable output and cut.

We don't need to invent all kinds of different ways to analyze data in 
the tools.  Let customers do that for themselves.  Because once you 
start doing it, you'll have a never-ending series of RFEs for different
kinds of analysis.  Just don't go there.

(Or if you really want to go there, look at doing it from within 
something like the analytics tools that are part of fishworks... don't 
burden the command line tools with this penalty!)

    - Garrett
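
The same snapshot-and-subtract idea extends to many counters at once. The following is a minimal sketch, assuming saved `kstat -p` output in which each line is a module:instance:name:statistic key, a tab, and a value; parse_kstat and kstat_delta are hypothetical helper names, not part of kstat or any Solaris library.

```python
def parse_kstat(text):
    """Parse `kstat -p` style lines into {key: int_value}.

    Lines without a tab, or whose value is not an integer
    (for example string-valued statistics), are skipped.
    """
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue
        try:
            counters[key] = int(value)
        except ValueError:
            continue
    return counters


def kstat_delta(snapshot, current):
    """Return {key: current - snapshot} for keys present in both dumps."""
    old = parse_kstat(snapshot)
    new = parse_kstat(current)
    return {k: new[k] - old[k] for k in new if k in old}
```

Because this only reads saved output, it needs no privileges beyond those required to run kstat, and the kernel counters themselves are never modified.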

Sunay Tripathi wrote:
> Darren,
>
> I appreciate the fact that you wrote a script in 15 min. Do
> you expect our end users each to write a script or a program
> and be well versed in shell/awk to get the basic information
> they want i.e. cumulative stats and the rate of change? I
> think we really need to make it easy for people to get that
> info.
>
> I like Jim's suggestion of putting the functionality in a
> library so other sub commands can use it as well. It
> doesn't need to be a standalone library if all we have is
> a few things. But keep these separate (at least in source
> files) so later if they need to be pulled into one library,
> it will be easy.
>
> Cheers,
> Sunay
>
>
> Darren Reed wrote:
>> James Carlson wrote:
>>> Darren Reed writes:
>>>> When I've had an Internet connection via an ISP that
>>>> meters traffic (you have a 'quota' of x gb/month),
>>>> I've run a cron job on the end-of-month day to dump
>>>> out the output from ipfstat and clear the counters
>>>> back to 0. Thus at any time during the month, using
>>>> ipfstat gives me a reasonable approximation of the
>>>> current quota use.
>>>
>>> I think it'd be nice if that cron job didn't have to run with
>>> privileges sufficient to write to the kernel.
>>>
>>> I agree that the delta tool doesn't necessarily have to be part of
>>> dlstat itself.  Perhaps if it were made a common library (along with
>>> the rest of the column-formatting stuff that Brussels cleaned up),
>>> that would address most of the concerns.

Sunay Tripathi

2009-Jun-26 22:47 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

And you would expect a user trying to see his usage/stat change on a
per-second basis under load to keep doing this? This is pretty much what
we get asked by every customer. Observing change as workload varies is
important to people. I agree with not trying to over-engineer but do
think in terms of usability. Listening to your non-engineer users once
in a while is not such a bad thing :)

Cheers,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Garrett D'Amore

2009-Jun-26 23:38 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
> And you would expect a user trying to see his usage/stat change on a per
> second basis under load to keep doing this? This is pretty much what
> we get asked by every customer. Observing change as workload varies is
> important to people. I agree with not trying to over engineer but do
> think in terms of usability. Listening to your non engineer users once
> in a while is not such a bad thing :)

So build a full-on graphical analysis tool like the amazing stuff they 
are doing with DTrace for the storage products.  Trying to band-aid 
around a problem that any junior sysadmin (and most end users) can 
script around in five minutes is not really solving anything.

Keep the underlying tools simple.  Build on top of them whatever 
analysis you want.

    - Garrett

Sunay Tripathi

2009-Jun-27 00:41 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
> Sunay Tripathi wrote:
>> And you would expect a user trying to see his usage/stat change on a per
>> second basis under load to keep doing this? This is pretty much what
>> we get asked by every customer. Observing change as workload varies is
>> important to people. I agree with not trying to over engineer but do
>> think in terms of usability. Listening to your non engineer users once
>> in a while is not such a bad thing :)
> 
> So build a full-on graphical analysis tool like the amazing stuff they 
> are doing with DTrace for the storage products.  Trying to band-aid 
> around a problem that any junior sysadmin (and most end users) can 
> script around in five minutes is not really solving anything.
> 
> Keep the underlying tools simple.  Build on top of them whatever 
> analysis you want.

Interesting. So either do the grand solution or not even bother to
make incremental improvements.

Anyway, instead of rat-holing on this, can you provide a more
concise reason why you think providing diffs or per-second rates
complicates things? Please do understand that the entire goal of
dlstat/flowstat is to help administrators see the stats in an
easy-to-consume manner.

The feature you are saying we shouldn't do has been requested by
more than 20 Crossbow customers. dlstat/flowstat and l2 protection
are the most requested features from current customers who are
deploying or considering deploying Crossbow.

Thanks,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Garrett D'Amore

2009-Jun-27 01:05 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
> Garrett D'Amore wrote:
>> Sunay Tripathi wrote:
>>> And you would expect a user trying to see his usage/stat change on a
>>> per second basis under load to keep doing this? This is pretty much
>>> what we get asked by every customer. Observing change as workload
>>> varies is important to people. I agree with not trying to over
>>> engineer but do think in terms of usability. Listening to your non
>>> engineer users once in a while is not such a bad thing :)
>>
>> So build a full-on graphical analysis tool like the amazing stuff 
>> they are doing with DTrace for the storage products.  Trying to 
>> band-aid around a problem that any junior sysadmin (and most end 
>> users) can script around in five minutes is not really solving
>> anything.
>>
>> Keep the underlying tools simple.  Build on top of them whatever 
>> analysis you want.
>
> Interesting. So either do the grand solution or not even bother to
> make incremental improvements.
>
> Anyway, instead of rat holing on this, can you provide a more
> concise reason why you think providing diffs or per second basis
> complicate things. Please do understand the entire goal of
> dlstat/flowstat is to help administrators see the stats in
> easy to consume manner.
>
> The feature you are saying we shouldn't
> do has been requested by over 20+ Crossbow customers. dlstat/flowstat
> and l2 protection is most requested feature from current customers
> who are deploying or considering deploying Crossbow.

I'm not saying you shouldn't provide the stats; I'm saying that dlstat
should probably keep it simple and report the stats and maybe running 
diffs like vmstat and netstat do.

Trying to build snapshot facilities and tools to do sophisticated 
analysis into what should be a very simple CLI tool is the wrong approach.

So I'm not saying that sophisticated analysis isn't useful.  I'm saying
that dlstat and flowstat are the wrong place for it.  They should be the 
building blocks to collect the stats; if you want to do fancy analysis 
then do it "right" with a nice GUI wizard.

Now, that said, are customers being very explicit in what they are 
asking for?  Are they specifically asking for the tool to provide 
arbitrary snapshotting facilities?  I suspect not -- I think it's more 
likely that they are asking for some very general stuff ("we would like 
to be able to figure out how our resources are being utilized without 
having to learn DTrace") and engineering is turning that into a 
requirement that isn't really what the customer is after.

    - Garrett
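
The "running diffs" style of output mentioned above can be sketched in a few lines. This is a minimal illustration, not how vmstat, netstat, or dlstat are implemented: it assumes the caller supplies a sequence of (timestamp in seconds, cumulative counters) samples, and running_rates is a hypothetical name.

```python
def running_rates(samples):
    """Yield {name: units_per_second} for each consecutive pair of samples.

    samples is an iterable of (timestamp_seconds, {name: cumulative_value}).
    The first sample only establishes the baseline, so nothing is yielded
    for it. Counters missing from the previous sample are ignored.
    """
    prev_time, prev = None, None
    for now, counters in samples:
        if prev is not None and now > prev_time:
            elapsed = now - prev_time
            yield {name: (counters[name] - prev[name]) / elapsed
                   for name in counters if name in prev}
        prev_time, prev = now, counters
```

Each yielded dictionary corresponds to one output row. The cumulative counters are only ever read, so no reset or stored snapshot is needed to get per-interval figures.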
>>>>> Darren Reed wrote:
>>>>>> James Carlson wrote:
>>>>>>> Darren Reed writes:
>>>>>>>> When I've had an Internet connection via an ISP that
>>>>>>>> meters traffic (you have a 'quota' of x gb/month),
>>>>>>>> I've run a cron job on the end-of-month day to dump
>>>>>>>> out the output from ipfstat and clear the counters
>>>>>>>> back to 0. Thus at any time during the month, using
>>>>>>>> ipfstat gives me a reasonable approximation of the
>>>>>>>> current quota use.
>>>>>>>
>>>>>>> I think it'd be nice if that cron job didn't have to run with
>>>>>>> privileges sufficient to write to the kernel.
>>>>>>>
>>>>>>> I agree that the delta tool doesn't necessarily have to be part of
>>>>>>> dlstat itself.  Perhaps if it were made a common library (along
>>>>>>> with the rest of the column-formatting stuff that Brussels cleaned up),
>>>>>>> that would address most of the concerns.
>>>>>>
>>>>>> A C library?
>>>>>>
>>>>>> To me that seems like making a mountain out of a molehill...
>>>>>>
>>>>>> 15 minutes and I came up with a perl script to print out what
>>>>>> has changed between two dumps of netstat output - and that's
>>>>>> just a very basic script.
>>>>>>
>>>>>> How long would it take to write in C? Substantially longer
>>>>>> because you've got to go through all sorts of hoops to handle
>>>>>> text input safely. For those that don't like perl, well, use
>>>>>> python or your other favourite scripting language.
>>>>>>
>>>>>> Darren
>>>>>>
>>>>>> ./statdiff netstat st.1 st.2
>>>>>> tcpInAckBytes   308
>>>>>> ipInReceives    6
>>>>>> ipOutRequests   5
>>>>>> tcpOutDataSegs  5
>>>>>> tcpOutDataBytes 308
>>>>>> tcpOutSegs      5
>>>>>> tcpRttUpdate    4
>>>>>> tcpInInorderSegs        4
>>>>>> ipInDelivers    6
>>>>>> tcpInAckSegs    4
>>>>>> tcpInSegs       6
>>>>>> tcpInInorderBytes       208
>>>>>>
>>>>>>
>>>>>> #!/bin/perl
>>>>>> # statdiff: print the counters that changed between two saved dumps.
>>>>>> use strict;
>>>>>> use warnings;
>>>>>>
>>>>>> if (@ARGV != 3) {
>>>>>>        print STDERR "Usage: statdiff <format> <file1> <file2>\n";
>>>>>>        exit(1);
>>>>>> }
>>>>>>
>>>>>> my (%fdata1, %fdata2);
>>>>>>
>>>>>> # Strings must be compared with "eq"; "==" compares numerically.
>>>>>> if ($ARGV[0] eq "netstat") {
>>>>>>        netstatfile($ARGV[1], \%fdata1);
>>>>>>        netstatfile($ARGV[2], \%fdata2);
>>>>>> } else {
>>>>>>        print STDERR "Unknown format: $ARGV[0]\n";
>>>>>>        exit(1);
>>>>>> }
>>>>>>
>>>>>> # Report every counter present in both dumps whose value changed.
>>>>>> foreach my $k (keys %fdata1) {
>>>>>>        next unless defined $fdata2{$k};
>>>>>>        if ($fdata1{$k} != $fdata2{$k}) {
>>>>>>                print "$k\t" . ($fdata2{$k} - $fdata1{$k}) . "\n";
>>>>>>        }
>>>>>> }
>>>>>>
>>>>>> exit(0);
>>>>>>
>>>>>> # Read "name = value" counter pairs from a saved dump into a hash.
>>>>>> sub netstatfile {
>>>>>>        my ($file, $stats) = @_;
>>>>>>        open(my $fh, "<", $file) or die "$file: $!\n";
>>>>>>        while (<$fh>) {
>>>>>>                chomp;
>>>>>>                s/^\S+//;       # drop the label at the start of the line
>>>>>>                while (/([a-z]\w*)\s*=\s*(\d+)/gi) {
>>>>>>                        $stats->{$1} = $2;
>>>>>>                }
>>>>>>        }
>>>>>>        close($fh);
>>>>>> }
>>>>>>
>>>>>> _______________________________________________
>>>>>> crossbow-discuss mailing list
>>>>>> crossbow-discuss at opensolaris.org
>>>>>> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
>>>>>
>>>>>
>>>
>>>
>
>

Darren Reed

2009-Jun-27 02:16 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
> Darren,
>
> I appreciate the fact that you wrote a script in 15 min. Do
> you expect our end users each to write a script or a program
> and be well versed in shell/awk to get the basic information
> they want i.e. cumulative stats and the rate of change? I
> think we really need to make it easy for people to get that
> info.

Systems administrators, for whom commands like this are
targeted, are regularly writing scripts in perl, shell, python,
whatever.

So, yes, I do expect them to because I know they do.

Been there, done that - it's part of the "job description".

While there's an obvious need for the likes of "dlstat -i 2",
it's building in the "diff" capability that I'm questioning.

Chances are that if you're snapshotting data, you want to
reformat it into CSV for a spreadsheet, reformat it for
feeding into gnuplot, or use the snapshot data as part of
some greater task.

> I like Jim's suggestion of putting the functionality in a
> library so other sub commands can use it as well. It
> doesn't need to be a standalone library if all we have is
> a few things. But keep these separate (at least in source
> files) so later if they need to be pulled into one library,
> it will be easy.

I'm with Garrett - you're over-engineering this.

Darren

Sunay Tripathi

2009-Jun-27 02:38 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
> Sunay Tripathi wrote:
>> Garrett D'Amore wrote:
>>> Sunay Tripathi wrote:
>>>> And you would expect a user trying to see his usage/stat change on a
>>>> per second basis under load to keep doing this? This is pretty much what
>>>> we get asked by every customer. Observing change as workload varies is
>>>> important to people. I agree with not trying to over engineer but do
>>>> think in terms of usability. Listening to your non engineer users once
>>>> in a while is not such a bad thing :)
>>>
>>> So build a full-on graphical analysis tool like the amazing stuff
>>> they are doing with DTrace for the storage products.  Trying to
>>> band-aid around a problem that any junior sysadmin (and most end
>>> users) can script around in five minutes is not really solving anything.
>>>
>>> Keep the underlying tools simple.  Build on top of them whatever
>>> analysis you want.
>>
>> Interesting. So either do the grand solution or not even bother to
>> make incremental improvements.
>>
>> Anyway, instead of rat holing on this, can you provide a more
>> concise reason why you think providing diffs or per second basis
>> complicate things? Please do understand the entire goal of
>> dlstat/flowstat is to help administrators see the stats in
>> an easy to consume manner.
>>
>> The feature you are saying we shouldn't
>> do has been requested by over 20 Crossbow customers. dlstat/flowstat
>> and l2 protection are the most requested features from current customers
>> who are deploying or considering deploying Crossbow.
>
> I'm not saying you shouldn't provide the stats, I'm saying that dlstat
> should probably keep it simple and report the stats and maybe running
> diffs like vmstat and netstat do.

Exactly. Diffs like the way vmstat/mpstat do on a user specified
interval are a must and are being specifically asked for.

> Trying to build snapshot facilities and tools to do sophisticated
> analysis into what should be a very simple CLI tool is the wrong approach.
>
> So I'm not saying that sophisticated analysis isn't useful.  I'm saying
> that dlstat and flowstat are the wrong place.  They should be the
> building blocks to collect the stats; if you want to do fancy analysis
> then do it "right" with a nice GUI wizard.

OK. I think I misunderstood your stand then. As long as we are clear
that dlstat/flowstat will provide diffs at a user specified interval, we
can look at the more sophisticated analysis part. Maybe giving the
ability to dump the output in 'csv' format etc. Let's discuss some
other options and try to keep that simple.

> Now, that said, are customers being very explicit in what they are
> asking for?  Are they specifically asking for the tool to provide
> arbitrary snapshotting facilities?  I suspect not -- I think it's more
> likely that they are asking for some very general stuff ("we would like
> to be able to figure out how our resources are being utilized without
> having to learn DTrace") and engineering is turning that into a
> requirement that isn't really what the customer is after.

Some of us are directly involved in these conversations so we
know what to ask. The project does seem to be useful to a few people ;^)
But you are correct. The requirements are more in line with being able
to get the diffs at a specified interval and generally there is demand
for more sophisticated analysis. The amber road appliances allow for
some of the analytics both via a web interface and cli and people
seem to love it. Let's see if we can simplify this a bit.

Cheers,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Sunay Tripathi

2009-Jun-27 02:40 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Darren Reed wrote:
> Sunay Tripathi wrote:
>> Darren,
>>
>> I appreciate the fact that you wrote a script in 15 min. Do
>> you expect our end users each to write a script or a program
>> and be well versed in shell/awk to get the basic information
>> they want i.e. cumulative stats and the rate of change? I
>> think we really need to make it easy for people to get that
>> info.
>
> Systems administrators, for whom commands like this are
> targeted, are regularly writing scripts in perl, shell, python,
> whatever.
>
> So, yes, I do expect them to because I know they do.
>
> Been there, done that - it's part of the "job description".

Huh!! Would love to meet some people who fit that description.

> While there's an obvious need for the likes of "dlstat -i 2",

As long as we are clear on that.

> it's building in the "diff" capability that I'm questioning.

Sure. Let's see if we can simplify that.

> Chances are that if you're snapshotting data, you want to
> reformat it into CSV for a spreadsheet, reformat it for
> feeding into gnuplot, or use the snapshot data as part of
> some greater task.

And that might be the option. Just allow dumping in a couple
of formats etc.

Cheers,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Darren Reed

2009-Jun-27 02:41 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
> ...
> Anyway, instead of rat holing on this, can you provide a more
> concise reason why you think providing diffs or per second basis
> complicate things? Please do understand the entire goal of
> dlstat/flowstat is to help administrators see the stats in
> an easy to consume manner.

Providing time interval deltas, "dlstat -i 1", "dlstat -i 30",
is completely understandable and indeed, it shows you how things
change over time in an easy to understand manner.
It's needed.

What I am questioning is doing "dlstat -d older -i 1" and
getting updates every second that show change since "older"
because your requirements (above) do not mention this.

Darren
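
[Editorial illustration] The two behaviours contrasted in this message (change since the previous sample versus change since an older baseline) can be sketched in a few lines of shell. This is only a sketch under stated assumptions, not how dlstat works: sample_deltas is a hypothetical helper, and the input is assumed to be one cumulative counter value per line, with the first line serving as the baseline.

```shell
# Hypothetical helper (not part of dlstat): reads one cumulative
# counter value per line. For each sample after the first it prints
# two numbers: the change since the previous sample (what an
# interval display would show) and the change since the first
# sample (the "since older" view questioned above).
sample_deltas() {
    awk 'NR == 1 { base = $1; prev = $1; next }
         { print $1 - prev, $1 - base; prev = $1 }'
}

# Made-up counter values, for illustration only.
printf '%s\n' 100 130 190 | sample_deltas
```

With the made-up samples above, the first column is the per-interval change and the second is the running change since the baseline.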

Garrett D'Amore

2009-Jun-27 03:00 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
> Darren Reed wrote:
>> Sunay Tripathi wrote:
>>> Darren,
>>>
>>> I appreciate the fact that you wrote a script in 15 min. Do
>>> you expect our end users each to write a script or a program
>>> and be well versed in shell/awk to get the basic information
>>> they want i.e. cumulative stats and the rate of change? I
>>> think we really need to make it easy for people to get that
>>> info.
>>
>> Systems administrators, for whom commands like this are
>> targeted, are regularly writing scripts in perl, shell, python,
>> whatever.
>>
>> So, yes, I do expect them to because I know they do.
>>
>> Been there, done that - it's part of the "job description".
>
> Huh!! Would love to meet some people who fit that description.

Here!  I was a UNIX admin at Qualcomm for about 4 years before I joined
Sun.  Ran a network of some 800 Suns, a variety of PCs, etc.  Did most
of the tools development for the group at the same time.

>> While there's an obvious need for the likes of "dlstat -i 2",
>
> As long as we are clear on that.

I don't think anyone is contesting that.

>> it's building in the "diff" capability that I'm questioning.
>
> Sure. Let's see if we can simplify that.
>
>> Chances are that if you're snapshotting data, you want to
>> reformat it into CSV for a spreadsheet, reformat it for
>> feeding into gnuplot, or use the snapshot data as part of
>> some greater task.
>
> And that might be the option. Just allow dumping in a couple
> of formats etc.

If you have a simple parseable format, then the rest is trivial.  Again,
don't over design this.  Space delimited columns are plenty adequate.
The rest is trivially extractable and importable.

    - Garrett

> Cheers,
> Sunay
>
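
[Editorial illustration] As a rough sketch of the point that space-delimited columns are easy to import elsewhere, the helper below converts such output to CSV. The function name columns_to_csv and the column names in the sample are hypothetical, since the thread does not define dlstat's output layout; the sketch also assumes no field contains a space, a comma, or a quote.

```shell
# Hypothetical helper: turn whitespace-delimited columns into CSV.
# Assigning $1 to itself makes awk rebuild the record using OFS.
columns_to_csv() {
    awk -v OFS=, '{ $1 = $1; print }'
}

# Made-up sample; the header and values are illustrative only.
printf '%s\n' 'LINK IPKTS OPKTS' 'net0 1200 950' | columns_to_csv
```

The resulting text can be loaded into a spreadsheet or plotted without the stats tool needing a CSV mode of its own.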

Sunay Tripathi

2009-Jun-27 07:17 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

>>>
>>> Systems administrators, for whom commands like this are
>>> targeted, are regularly writing scripts in perl, shell, python,
>>> whatever.
>>>
>>> So, yes, I do expect them to because I know they do.
>>>
>>> Been there, done that - it's part of the "job description".
>>
>> Huh!! Would love to meet some people who fit that description.
>
> Here!  I was a UNIX admin at Qualcomm for about 4 years before I joined
> Sun.  Ran a network of some 800 Suns, a variety of PCs, etc.  Did most
> of the tools development for the group at the same time.

And you think a few of you (who also happen to be Solaris developers)
fit the common profile of a system administrator
of today? Or because you were once a system administrator, we
should design based on your requirement? I have been on a
customer trip for the last 2 months (having met over 30 customers) and
most of them don't have people or time to do our work. Perhaps
instead of arguing with me on customer requirements, make a tour
and get your facts straight.

I apologize if I am sounding a bit hard but it's preconceived notions
like this (without any facts) that make it harder for Solaris to
penetrate new markets and new customers. Instead of trying to make
things simple and incrementally better (where we can), we end up
arguing that if I am able to do it, then so should everyone else. And
the result is we constantly get slammed for being too complex and
hard to use.

>>> While there's an obvious need for the likes of "dlstat -i 2",
>>
>> As long as we are clear on that.
>
> I don't think anyone is contesting that.
>>
>>> it's building in the "diff" capability that I'm questioning.

And how do you think dlstat -i 2 will do the diff? Anyway, I
noted the concerns around the more intricate snapshot option (and
kind of agree with that). We will look into simplifying it. If
you or Darren have any other tangible concerns, please let us
know.

Cheers,
Sunay

Garrett D'Amore

2009-Jun-27 12:56 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Sunay Tripathi wrote:
>
>>>>
>>>> Systems administrators, for whom commands like this are
>>>> targeted, are regularly writing scripts in perl, shell, python,
>>>> whatever.
>>>>
>>>> So, yes, I do expect them to because I know they do.
>>>>
>>>> Been there, done that - it's part of the "job description".
>>>
>>> Huh!! Would love to meet some people who fit that description.
>>
>> Here!  I was a UNIX admin at Qualcomm for about 4 years before I
>> joined Sun.  Ran a network of some 800 Suns, a variety of PCs, etc.
>> Did most of the tools development for the group at the same time.
>
> And you think a few of you (who also happen to be Solaris developers)
> fit the common profile of a system administrator
> of today? Or because you were once a system administrator, we
> should design based on your requirement? I have been on a
> customer trip for the last 2 months (having met over 30 customers) and
> most of them don't have people or time to do our work. Perhaps
> instead of arguing with me on customer requirements, make a tour
> and get your facts straight.

I think too many times in the past such data was gathered by
talking to CEOs and CIOs, and not talking to the grunts in the field who
really know what they want.   If that's not what has occurred here, then
it's a refreshing change.  (I've seen so many tools come out of Sun built
by what are perceived market requests, but which suck so badly just
because either the market requests came from folks who don't use the
tools -- like executives -- or were interpreted by engineers at Sun who
have no day-to-day sysadmin experience and so don't really understand
the requests.)

Again, I'm not saying that it is what has happened here, just wanting to
make sure it *isn't* what happened.  You've already indicated it isn't.

> I apologize if I am sounding a bit hard but it's preconceived notions
> like this (without any facts) that make it harder for Solaris to
> penetrate new markets and new customers. Instead of trying to make
> things simple and incrementally better (where we can), we end up
> arguing that if I am able to do it, then so should everyone else. And
> the result is we constantly get slammed for being too complex and
> hard to use.

You seem to have misunderstood what Darren and I have been saying.
Don't add complexity in the CLI tool that customers don't need or really
want.  Do add the basic functionality that customers do want -- simple
historical tracking like netstat -i <interval> is probably a good request.

If customers want more than that, make it easy for them to build it.
(Because each customer is going to have a different want.)

For customers that don't want raw data, but want a nicer analytics thing
-- that should be done with a GUI tool built on top of the underlying
stats.  And you should still do the -i <interval> in the CLI.

>>>> While there's an obvious need for the likes of "dlstat -i 2",
>>>
>>> As long as we are clear on that.
>>
>> I don't think anyone is contesting that.
>>>
>>>> it's building in the "diff" capability that I'm questioning.
>
> And how do you think dlstat -i 2 will do the diff? Anyway, I
> noted the concerns around the more intricate snapshot option (and
> kind of agree with that). We will look into simplifying it. If
> you or Darren have any other tangible concerns, please let us
> know.

It's the snapshotting ability (named snapshots or whatever) that I was
contesting.  Any admin can come up with a solution of his own in about 5
minutes for that by just redirecting the output to a named file.

Keeping a running tally the way vmstat/netstat -i do is a reasonable
feature plan.

I'm done on the topic now.

    - Garrett

> Cheers,
> Sunay
>

Darren Reed

2009-Jun-27 13:21 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
> ...
>>
>> I apologize if I am sounding a bit hard but it's preconceived notions
>> like this (without any facts) that make it harder for Solaris to
>> penetrate new markets and new customers. Instead of trying to make
>> things simple and incrementally better (where we can), we end up
>> arguing that if I am able to do it, then so should everyone else. And
>> the result is we constantly get slammed for being too complex and
>> hard to use.
>
> You seem to have misunderstood what Darren and I have been saying.
> Don't add complexity in the CLI tool that customers don't need or
> really want.  Do add the basic functionality that customers do want --
> simple historical tracking like netstat -i <interval> is probably a
> good request.
>
> If customers want more than that, make it easy for them to build it.
> (Because each customer is going to have a different want.)
>
> For customers that don't want raw data, but want a nicer analytics
> thing -- that should be done with a GUI tool built on top of the
> underlying stats.  And you should still do the -i <interval> in the CLI.

Yes, yes and yes.

Darren

Sunay Tripathi

2009-Jun-27 18:19 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
> Sunay Tripathi wrote:
>>
>>>>>
>>>>> Systems administrators, for whom commands like this are
>>>>> targeted, are regularly writing scripts in perl, shell, python,
>>>>> whatever.
>>>>>
>>>>> So, yes, I do expect them to because I know they do.
>>>>>
>>>>> Been there, done that - it's part of the "job description".
>>>>
>>>> Huh!! Would love to meet some people who fit that description.
>>>
>>> Here!  I was a UNIX admin at Qualcomm for about 4 years before I
>>> joined Sun.  Ran a network of some 800 Suns, a variety of PCs, etc.
>>> Did most of the tools development for the group at the same time.
>>
>> And you think a few of you (who also happen to be Solaris developers)
>> fit the common profile of a system administrator
>> of today? Or because you were once a system administrator, we
>> should design based on your requirement? I have been on a
>> customer trip for the last 2 months (having met over 30 customers) and
>> most of them don't have people or time to do our work. Perhaps
>> instead of arguing with me on customer requirements, make a tour
>> and get your facts straight.
>
> I think too many times in the past such data was gathered by
> talking to CEOs and CIOs, and not talking to the grunts in the field who
> really know what they want.   If that's not what has occurred here, then
> it's a refreshing change.  (I've seen so many tools come out of Sun built
> by what are perceived market requests, but which suck so badly just
> because either the market requests came from folks who don't use the
> tools -- like executives -- or were interpreted by engineers at Sun who
> have no day-to-day sysadmin experience and so don't really understand
> the requests.)
>
> Again, I'm not saying that it is what has happened here, just wanting to
> make sure it *isn't* what happened.  You've already indicated it isn't.
>
>>
>> I apologize if I am sounding a bit hard but it's preconceived notions
>> like this (without any facts) that make it harder for Solaris to
>> penetrate new markets and new customers. Instead of trying to make
>> things simple and incrementally better (where we can), we end up
>> arguing that if I am able to do it, then so should everyone else. And
>> the result is we constantly get slammed for being too complex and
>> hard to use.
>
> You seem to have misunderstood what Darren and I have been saying.
> Don't add complexity in the CLI tool that customers don't need or really
> want.  Do add the basic functionality that customers do want -- simple
> historical tracking like netstat -i <interval> is probably a good request.

It was a long email thread and it wasn't clear to me if you guys
were arguing against the basic idea of diff itself or a specific
part. It became clear in the last few emails. And then the definition
of usability by Solaris engineers as being typical made me see red.
We honestly have a long way to go to make things end user friendly
and need to try and take an incremental opportunity instead of
pinning our hopes on grand designs.

Anyway, I appreciate the help and feedback and apologies if I
sounded a bit strong. Beer on me next time I see you guys.

cheers,
Sunay

Jason King

2009-Jun-27 18:58 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

On Sat, Jun 27, 2009 at 2:17 AM, Sunay Tripathi<Sunay.Tripathi at sun.com>
wrote:>
>>>>
>>>> Systems administrators, for whom commands like this are
>>>> targetted, are regularly writing scripts in perl, shell,
python,
>>>> whatever.
>>>>
>>>> So, yes, I do expect them to because I know they do.
>>>>
>>>> Been there, done that - it''s part of the "job
description".
>>>
>>> Huh!! Would love to meet some people who fit that description.
>>
>> Here! ?I was a UNIX admin at Qualcomm for about 4 years before I joined
>> Sun. ?Ran a network of some 800 Sun''s, a variety of PCs, etc.
?Did most of
>> the tools development for the group at the same time.
>
> And you think few of you (who also happen to be Solaris developers)
> fit the common profile of a system administrator
> of today? Or because you were once a system administrator, we
> should design based on your requirement. I have been on
> customer trip for last 2 months (having met over 30 customers) and
> most of them don''t have people or time to do our work. Perhaps
> instead of arguing with me on customer requirements, make a tour
> and get your facts straight.
>
> I apologize if I am sounding a bit hard but its preconceived notions
> like this (without any facts) that makes it harder for Solaris to
> penetrate new markets on new customers. Instead of trying to make
> things simple and incrementally better (where we can), we end up
> arguing that if I am able to it, then so should everyone else. And
> the result is we constantly get slammed for being too complex and
> hard to use.
>
>>>> While there''s an obvious need for the likes of
"dlstat -i 2",
>>>
>>> As long as we are clear on that .
>>
>> I don''t think anyone is contesting that.
>>>
>>>> it''s building in the "diff" capability that
I''m questioning.
>
> And how do you think dlstat -i 2 will do the diff? Anyway, I
> noted the concerns around more intricate snapshot option (and
> kind of agree with that). We will look into simplifying it. If
> you or Darren have any other tangible concerns, please let us
> know.
As someone who is currently a sysadmin (but does a bit of development
on the side), and having worked at both large and small Sun customers,
here are the use cases I can think of off the top of my head:

1. Data collection for long(er) term storage & analysis.  Think RRDs
or such.  Essentially, provide the raw counters in an easily parseable
format.  Whatever is collecting the values can take care of any
necessary calculations (rrdtool, for example, understands the notion of
counters and can take care of generating interval values or rates from
the raw data).  This allows me to compare current behavior with past
behavior, do trending, etc.

2. Formatted output (for humans) for immediate troubleshooting.  Think
vmstat, mpstat, or especially nicstat (I know it's not included, but
it should be :P).  This would mean calculating differences over a
user-supplied interval, and ideally calculating average rates over
that interval as well.  In this case, readability is more important
than parseability, though it doesn't mean you can't have both (just
not required).  It is possible with #1 to write a script to do this;
however, my experience is that a lot of places consider 'scripting' to
be an advanced (or sometimes very advanced) skill, so I'd rather see
it included in some fashion.

I don't think meeting these two requirements would be too complex and
I think it would strike a good balance between minimal & too much.  #2 is
probably good enough for most people just wanting a tool they can look
at, while #1 would allow for those that want something more
feature-rich or complex to develop it on their own (which I suspect
would be a minority of people).
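
As a rough illustration of how use case #2 can be derived from the raw counters of use case #1, here is a minimal Python sketch. The counter names and sample values are made up for the example, and the function name is hypothetical; nothing here reflects the actual dlstat output or implementation.

```python
from typing import Dict

Counters = Dict[str, int]


def interval_rates(prev: Counters, curr: Counters, seconds: float) -> Dict[str, float]:
    """Return per-second rates for every counter present in both samples."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return {
        name: (curr[name] - prev[name]) / seconds
        for name in curr
        if name in prev
    }


# Two hypothetical cumulative samples taken 2 seconds apart.
t0 = {"ipackets": 1000, "rbytes": 640000}
t1 = {"ipackets": 1500, "rbytes": 960000}

rates = interval_rates(t0, t1, 2.0)
for name, rate in sorted(rates.items()):
    print(f"{name:10s} {rate:12.1f}/s")
```

The collector only ever stores cumulative values; differences and rates are computed at display time, which is the same division of labour that rrdtool applies to its counter data sources.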

Nicolas Droux

2009-Jun-29 19:14 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

On Jun 27, 2009, at 5:56 AM, Garrett D'Amore wrote:
> It's the snapshotting ability (named snapshots or whatever) that I
> was contesting.

The last proposal was really simple. Basically, redirect the
dlstat/flowstat parseable output to a file, and if such a file is
specified at a later time through an option, it will be used as a
baseline, and the statistics displayed will be a diff against that
baseline. Sounds really straightforward to me and I'm still puzzled
that it generated such a heated debate.

Nicolas.

-- 
Nicolas Droux - Solaris Kernel Networking - Sun Microsystems, Inc.
nicolas.droux at sun.com - http://blogs.sun.com/droux
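
The baseline idea described in the message above can be sketched in a few lines of Python. This is only an illustration: the colon-separated "link:ipackets:rbytes:opackets:obytes" layout, the field list, the link names and the function names are all assumptions made for the example, since the thread does not spell out the parseable format.

```python
from typing import Dict, List

# Assumed column order of the (hypothetical) parseable output.
FIELDS: List[str] = ["ipackets", "rbytes", "opackets", "obytes"]

Snapshot = Dict[str, Dict[str, int]]


def parse_snapshot(text: str) -> Snapshot:
    """Parse 'link:v1:v2:v3:v4' lines into {link: {field: value}}."""
    snapshot: Snapshot = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        link, *values = line.split(":")
        snapshot[link] = dict(zip(FIELDS, map(int, values)))
    return snapshot


def diff_against_baseline(baseline: Snapshot, current: Snapshot) -> Snapshot:
    """Subtract baseline counters from current ones, link by link.

    Links missing from the baseline are reported with their current values.
    """
    result: Snapshot = {}
    for link, stats in current.items():
        base = baseline.get(link, {})
        result[link] = {field: value - base.get(field, 0)
                        for field, value in stats.items()}
    return result


# Saved earlier (the baseline file) and read now (the current counters).
baseline_text = "e1000g0:100:6400:80:5120\n"
current_text = "e1000g0:250:16000:200:12800\nvnic1:10:640:5:320\n"

delta = diff_against_baseline(parse_snapshot(baseline_text),
                              parse_snapshot(current_text))
for link, stats in delta.items():
    print(link, stats)
```

The same few lines work whether the subtraction lives inside the tool or in a script wrapped around it, which is essentially what the rest of the thread argues about.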

Garrett D'Amore

2009-Jun-29 20:18 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Nicolas Droux wrote:
>
> On Jun 27, 2009, at 5:56 AM, Garrett D'Amore wrote:
>
>> It's the snapshotting ability (named snapshots or whatever) that I was
>> contesting.
>
> The last proposal was really simple. Basically, redirect the
> dlstat/flowstat parseable output to a file, and if such a file is
> specified at a later time through an option, it will be used as a
> baseline, and the statistics displayed will be a diff against that
> baseline. Sounds really straightforward to me and I'm still puzzled
> that it generated such a heated debate.
>
> Nicolas.
>

I guess I'm ok with it.  It's just that you then introduce a parser into
the code, and that further complicates things.  If it's deemed truly
useful, then go for it.  I just prefer to avoid adding code when it
isn't likely to add much value.

Again, this all boils down to my KISS philosophy.

It's easy to engineer all kinds of nifty facilities; it's harder to draw
the line, once you start, as to where you stop.

For example, should every statistic tool have such a facility?  Would
the snapshot file be portable across boots?  What kind of validation
would you perform, if any, to catch the case where a statistic is reset
(e.g. due to a reboot, or to some other administrative change)?  What
happens if the tool changes, so that the content changes?

All of these questions (which might not be terribly hard to answer, but
nonetheless need an answer) can simply be avoided if one just assumes
that the end user will deal with this himself using whatever typical
tools they are already using for statistic tracking elsewhere...

    - Garrett
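
One possible answer to the validation question raised above, sketched in Python. It relies on a single heuristic that is an assumption of this example rather than anything proposed in the thread: a cumulative counter should never decrease, so a current value below the baseline suggests a reset. The function names are hypothetical.

```python
from typing import Dict

Counters = Dict[str, int]


def baseline_is_usable(baseline: Counters, current: Counters) -> bool:
    """A cumulative counter should never decrease; if one did, assume a reset."""
    return all(current[name] >= value
               for name, value in baseline.items()
               if name in current)


def safe_delta(baseline: Counters, current: Counters) -> Counters:
    """Diff against the baseline, or fall back to raw values if it looks stale."""
    if not baseline_is_usable(baseline, current):
        return dict(current)
    return {name: value - baseline.get(name, 0)
            for name, value in current.items()}


# Normal case: counters only grew since the snapshot.
print(safe_delta({"ipackets": 100}, {"ipackets": 150}))
# Counter went backwards (e.g. after a reboot): the baseline is ignored.
print(safe_delta({"ipackets": 100}, {"ipackets": 20}))
```

The heuristic is deliberately weak: a reset followed by enough traffic to exceed the old value goes undetected, which is one reason the questions above still need an explicit answer (for example, recording the boot time in the snapshot).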

Robert Milkowski

2009-Jun-30 07:11 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
> Sunay Tripathi wrote:
>> And you would expect a user trying to see his usage/stat change on a
>> per second basis under load to keep doing this? This is pretty much what
>> we get asked by every customer. Observing change as workload varies is
>> important to people. I agree with not trying to over engineer but do
>> think in terms of usability. Listening to your non engineer users once
>> in a while is not such a bad thing :)
>
> So build a full-on graphical analysis tool like the amazing stuff they
> are doing with DTrace for the storage products.  Trying to band-aid
> around a problem that any junior sysadmin (and most end users) can
> script around in five minutes is not really solving anything.
>
> Keep the underlying tools simple.  Build on top of them whatever
> analysis you want.

Even with dtrace you have built-in aggregations and nice visualizations
from a command line - and believe me it is very helpful
not to have to parse every dtrace output thru perl/awk/shell.

Same goes for other basic observability in a system - be it vmstat,
mpstat, iostat, etc...
I hope you are not arguing for getting rid of these tools because a user
can quickly write a script to get the same functionality by using kstat.
Having tools like Analytics integrated into the OS and expanded for even
greater observability would be great to see, but at the same time
providing good CLI tools for observability and analysis should also be a
goal, as should the ability to just get raw numbers and let the user do
whatever he wants with them. I believe there is a great need for all of
these approaches.

While I haven't followed the discussion from the beginning and in detail,
and maybe whatever is proposed here is overarchitected or maybe not,
I would definitely love to be able to easily get all basic
network-related data out of the box without having to write any scripts.
There is a reason why sysadmins are using scripts like nicstat.pl
(http://blogs.sun.com/timc/entry/nicstat_the_solaris_network_monitoring)
but it should really be part of the OS.

Sorry if I misunderstood something - I will try to read the entire thread
more closely later.

-- 
Robert Milkowski
http://milek.blogspot.com

Robert Milkowski

2009-Jun-30 07:22 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
>
> You seem to have misunderstood what Darren and I have been saying.
> Don't add complexity in the CLI tool that customers don't need or
> really want.  Do add the basic functionality that customers do want --
> simple historical tracking like netstat -i <interval> is probably a
> good request.
>
> If customers want more than that, make it easy for them to build it.
> (Because each customer is going to have a different want.)
>
> For customers that don't want raw data, but want a nicer analytics
> thing -- that should be done with a GUI tool built on top of the
> underlying stats.  And you should still do the -i <interval> in the
> CLI.

I believe that there is also a great need for tools in between raw data
and full-blown GUIs like Fishworks.
That's why people are using 3rd party tools in Solaris like iftop,
nicstat.pl, iotop, etc.

> It's the snapshotting ability (named snapshots or whatever) that I was
> contesting.  Any admin can come up with a solution of his own in about
> 5 minutes for that by just redirecting the output to a named file.
>

And you may be right here with snapshotting...

Garrett D'Amore

2009-Jun-30 07:33 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Robert Milkowski wrote:
> Garrett D'Amore wrote:
>>
>> You seem to have misunderstood what Darren and I have been saying.
>> Don't add complexity in the CLI tool that customers don't need or
>> really want.  Do add the basic functionality that customers do want
>> -- simple historical tracking like netstat -i <interval> is probably
>> a good request.
>>
>> If customers want more than that, make it easy for them to build
>> it.   (Because each customer is going to have a different want.)
>>
>> For customers that don't want raw data, but want a nicer analytics
>> thing -- that should be done with a GUI tool built on top of the
>> underlying stats.  And you should still do the -i <interval> in
>> the CLI.
> I believe that there is also a great need for tools in between
> raw-data and a full blown GUIs like Fishworks.
> That's why people are using 3rd party tools in Solaris like iftop,
> nicstat.pl, iotop, etc.

Obviously I don't think folks should have to grunge through kstat(1M)
output -- if we can provide the information that 90% of the people need
90% of the time in a simple CLI, hey, that sounds like something we darn
well ought to do.

It's the incremental changes beyond that 90% that concern me.  (And yes,
I'm pulling these meaningless percentages out of my nether regions, but
the point is still valid I think.)  If you have to add 20% to the code
size, add architectural questions like "file stability", add a parser,
etc, and it only really solves a problem that 5% of the users of these
features need about 5% of the time, and which those 5% could easily
build themselves in less than 10 minutes of coding with a trivial bit of
shell script, then that sounds like a poor trade off to me.

So, the "-i" running increments are definitely a tool that I think hits a
large class of users a lot of the time, and which will satisfy a huge
number of customers.  Probably nearly all the users would be happy with
just that.

The snapshot facility I suspect probably has a much lower threshold.
The really big paying customers already know how to solve that problem
(given the existence of a -i parseable format) themselves in the 5
minutes it's going to take -- so I doubt that that particular feature has
shown up on any real product requirements.  (It might have been a low
priority RFE mentioned... that I'd be inclined to believe....)

I don't really want to discuss this further... I think the team
understands my reservations, and I'm confident that while they'll make
their own decision on the matter, they've at least considered my opinion
on it.

    -- Garrett

Darren Reed

2009-Jun-30 14:08 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Nicolas Droux wrote:
>
> On Jun 27, 2009, at 5:56 AM, Garrett D'Amore wrote:
>
>> It's the snapshotting ability (named snapshots or whatever) that I was
>> contesting.
>
> The last proposal was really simple. Basically, redirect the
> dlstat/flowstat parseable output to a file, and if such a file is
> specified at a later time through an option, it will be used as a
> baseline, and the statistics displayed will be a diff against that
> baseline. Sounds really straightforward to me and I'm still puzzled
> that it generated such a heated debate.

The reason I'm against it is that comparing statistical
data from one instantiation to the next instantiation of
a command is better left to tools dedicated to performing
that kind of task.

Why?

Because it then provides you with a good place to present
that data in lots of different ways without adding lots
of functionality that isn't directly related to retrieving
and displaying that information to dlstat/flowstat.

For example, if the tool to do statistical differences is
written well, it should be able to easily work on every
command that produces "parseable" statistics. Thus ipmpstat
benefits, along with others, without needing to modify them.

As I demonstrated, using a programming language that is built
to handle text makes it trivial to write a program that is
purpose built to compare statistical output. And such programs
are considerably easier to make "secure" when dealing with
text.

Also, I haven't seen a functional requirement statement from
a customer that requires it be built in to such commands.

The only functionality that I would see hard to reproduce
externally would be "dlstat -d older -i 2". But then what
would I use that output for? On a sufficiently busy system,
the numbers output every second are most likely going to be
too busy for me to do the maths in my head to calculate
accurate differences, so delta information can't be it.
All that I can imagine it being used for is to find out when
a particular statistic reaches more than X from a given point
in time. Given I'm interested in that, how easy is it to then
use that information? A solid use case would be helpful in
understanding why this needs to be built in.

Darren

Robert Milkowski

2009-Jun-30 22:20 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Garrett D'Amore wrote:
>
> I don't really want to discuss this further... I think the team
> understands my reservations, and I'm confident that while they'll make
> their own decision on the matter, they've at least considered my
> opinion on it.
>

I wrote my email before I read the entire thread - my fault, sorry.
Looks like there was a misunderstanding and you meant something other than
I thought you did. After a couple of other emails it was clear that
basically everyone in this discussion is more or less on the same page :)


-- 
Robert Milkowski
http://milek.blogspot.com

Nicolas Droux

2009-Jun-30 22:57 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

Darren,

There's some latitude between not supporting any baseline, and a
full-fledged tool which supports statistics from every subsystem. Having a
single option which can be used to specify a baseline is a
straightforward incremental extension which doesn't prevent other more
sophisticated tools from being built in the future. It also allows the user
to leverage the various formatting and options offered by dladm to
zero in on a particular type of data when analyzing a problem (e.g. per
data-link, software/hardware rings, packet scheduling information, etc.)

Also to clarify, using the -i <interval> option will still print the
delta between the samples. If a baseline is specified, the initial value
of the stats will show the delta between the baseline and the current
statistics instead of the current statistics.

Nicolas.
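
The interaction between the interval option and a baseline, as clarified in the message above, can be modelled with a short Python sketch. The generator name, the counter name and the sample values are hypothetical; this models the described behaviour only and is not the dlstat implementation.

```python
from typing import Dict, Iterable, Iterator, Optional

Counters = Dict[str, int]


def interval_rows(samples: Iterable[Counters],
                  baseline: Optional[Counters] = None) -> Iterator[Counters]:
    """Yield one output row per sample.

    First row: current counters, or their delta against the baseline if one
    was given. Every later row: delta against the previous sample.
    """
    previous: Optional[Counters] = None
    for sample in samples:
        if previous is None:
            reference = baseline if baseline is not None else {}
        else:
            reference = previous
        yield {name: value - reference.get(name, 0)
               for name, value in sample.items()}
        previous = sample


# Three hypothetical cumulative samples taken at successive intervals.
samples = [{"ipackets": 1000}, {"ipackets": 1200}, {"ipackets": 1500}]

without_baseline = list(interval_rows(samples))
with_baseline = list(interval_rows(samples, baseline={"ipackets": 900}))
print(without_baseline)
print(with_baseline)
```

Only the first row differs between the two runs; every subsequent row is the delta between consecutive samples in both cases.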

Darren Reed

2009-Jul-06 05:51 UTC

[crossbow-discuss] inception review summary of PSARC/2009/364 - dlstat and flowstat

On 06/30/09 15:57, Nicolas Droux wrote:
> Darren,
>
> There's some latitude between not supporting any baseline, and a full
> fledge tool which supports statistics from every subsystem. Having a
> single option which can be used to specify a baseline is a
> straightforward incremental extension which doesn't prevent other more
> sophisticated tools to be built in the future. It also allows the user
> to leverage the various formatting and options offered by dladm to
> zero-in on a particular type of data when analyzing a problem (e.g.
> per data-link, software/hardware rings, packet scheduling information,
> etc.)
>
> Also to clarify, using the -i <interval> option will still print the
> delta between the samples. If a baseline is specified, the initial
> value of the stats will show the delta between the baseline and the
> current statistics instead of the current statistics.

If the "-d" option is going to be introduced, given that it is new
behaviour for any tool on Solaris, it would be well worth the effort
to go into some detail about how it interacts with all of the other
options available for dlstat/flowstat.

That said, I'm still getting the feeling that this is a solution looking
for a problem but if others are happy...

Darren
