Brendan Gregg - Sun Microsystems
2007-Jan-24 02:02 UTC
[crossbow-discuss] Network stats with netstat/nicstat
G'Day Folks,
I've just joined this mailing list and have been catching up on the
recent thread about a network utilization stats addition to netstat.
This sounds like an opportunity to get nicstat, or the look and feel
of nicstat, into Solaris. This is nicstat:
$ nicstat 1
Time      Int     rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:00  nge1     0.00    0.00    0.00    0.00    0.00   346.0    0.00    0.00
01:10:00  nge0    11.24   22.01   22.51   28.97   511.6   778.0    0.03    0.00
01:10:00  lo0      0.00    0.00    3.42    3.42    0.00    0.00    0.00    0.00
Time      Int     rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:01  nge1     0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
01:10:01  nge0   2437.3   583.9 20546.3 10300.5   121.5   58.05    2.47    0.00
01:10:01  lo0      0.00    0.00    1.99    1.99    0.00    0.00    0.00    0.00
Time      Int     rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:02  nge1     0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
01:10:02  nge0   2322.5   569.1 20057.3 10038.5   118.6   58.06    2.37    0.00
01:10:02  lo0      0.00    0.00    1.98    1.98    0.00    0.00    0.00    0.00
^C
$ nicstat -h
USAGE: nicstat [-hsz] [-i int[,int...]] | [interval [count]]
         -h              # help
         -i interface    # track interface only
         -n              # show non-local interfaces only (exclude lo0)
         -s              # summary output
         -z              # skip zero value lines
    eg,
       nicstat           # print summary since boot only
       nicstat 1         # print every 1 second
       nicstat 1 5       # print 5 times only
       nicstat -z 1      # print every 1 second, skip zero lines
       nicstat -i hme0 1 # print hme0 only every 1 second
I wrote nicstat some years ago to fetch utilization on systems without
the SE Toolkit installed; also to emphasise the notion of performance
resource monitoring by utilization and by saturation, and as part of
an open source toolkit of kstat-based programs. Tim Cook from
Sun PAE and I currently develop the nicstat source.
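
The derived columns in the sample output above can be reproduced from the four rate columns. What follows is a minimal Python sketch inferred from the printed figures, not taken from the nicstat source. It assumes that rAvs and wAvs are average packet sizes in bytes, that KB means 1024 bytes, that %Util is combined read and write throughput as a percentage of link speed (capped at 100), and that nge0 is a 1 Gbit/s link. The function name and parameters are hypothetical.

```python
def derived_columns(rkb_s, wkb_s, rpk_s, wpk_s, link_bits_per_sec=1_000_000_000):
    """Return (rAvs, wAvs, %Util) for one interface sample.

    Assumptions (inferred from the sample output, not from nicstat itself):
    KB means 1024 bytes, and utilization sums both directions.
    """
    # Average packet size in bytes; avoid dividing by zero on idle links.
    ravs = rkb_s * 1024 / rpk_s if rpk_s else 0.0
    wavs = wkb_s * 1024 / wpk_s if wpk_s else 0.0
    # Total bits per second as a percentage of link speed, capped at 100.
    util = (rkb_s + wkb_s) * 1024 * 8 / link_bits_per_sec * 100
    return ravs, wavs, min(util, 100.0)


# The 01:10:01 nge0 sample from the output above.
ravs, wavs, util = derived_columns(2437.3, 583.9, 20546.3, 10300.5)
print(f"rAvs={ravs:.1f} wAvs={wavs:.2f} %Util={util:.2f}")
```

For that sample the sketch prints rAvs=121.5 wAvs=58.05 %Util=2.47, which agrees with the nge0 row; the same arithmetic applied to other rows is a quick sanity check on whether a reported rate is plausible.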
nicstat has been used for years by customers in production, and has
had many subtle tweaks based on this experience (and a few more on
the todo list). Recently nicstat and its source code were published in
Solaris Performance and Tools, familiarizing more customers with
its form of output.
I (and many others) would like to see nicstat, or nicstat's style of
output, included in Solaris, either as a command or as a switch
to netstat. I don't see why this can't be its own tool - netstat is
already a kitchen sink of commands.
Having a prstat-style output would be handy in some situations, especially
due to project Crossbow, and could be a switch to nicstat. It is not what
is wanted as the default output from a performance tool - customers
like the default mpstat-style output of nicstat, and the ability to match on
specified interfaces (-i).
Recent versions of nicstat are here:
http://www.brendangregg.com/K9Toolkit/nicstat.c # C
http://www.brendangregg.com/K9Toolkit/nicstat # Perl
Tim and I were working on some updates...
Mike, please take a look at the nicstat code; if you can add the
features that you think are useful (a prstat-style output as a switch),
Tim and I can add our final updates - and we should have an awesome
tool that pleases everyone.
cheers,
Brendan
--
Brendan
[CA, USA]
Sorry if this is not the proper place to post...
I just discovered nicstat yesterday! We have a mix of Sun and Fujitsu Prime
Powers... I added the Fujitsu Gigabit interfaces to nicstat and things look like
they are running well:
Time      Int       rKb/s     wKb/s     rPk/s     wPk/s      rAvs      wAvs     %Util       Sat
16:25:48  fjge2      0.24      0.92      2.55      2.85     95.69    329.54      0.00      0.00
16:25:48  fjge0      9.93      1.62     15.80      6.18    643.68    269.02      0.01      0.00
16:25:48  fjge3      0.11      0.13      1.05      0.80    105.76    168.83      0.00      0.00
16:25:48  fjge1  49889.96   1179.98  34305.80  17224.20   1489.17     70.15     41.84      0.00
but then eventually, when the system really gets busy:
Time      Int                 rKb/s    wKb/s     rPk/s     wPk/s               rAvs    wAvs   %Util   Sat
16:30:48  fjge2                0.58     1.25      5.75      6.52             103.94  195.72    0.00  0.00
16:30:48  fjge0                0.86     0.20      6.70      0.93             131.13  218.88    0.00  0.00
16:30:48  fjge3  300239975113690.69   268.09  17742.80   3918.23  17327915239782.86   70.06  100.00  0.00
16:30:48  fjge1  300239975147902.75  1414.54  41165.58  20648.68   7468513978338.35   70.15  100.00  0.00
things appear to get a bit off... any idea?
Thanks!
john
again, i apologize if this is the wrong area to post this...
This message posted from opensolaris.org
Brendan Gregg - Sun Microsystems
2007-Jan-24 19:16 UTC
[crossbow-discuss] Re: Network stats with netstat/nicstat
G'Day John,

On Wed, Jan 24, 2007 at 10:36:43AM -0800, John wrote:
> Sorry if this is not the proper place to post...
>
> I just discovered nicstat yesterday! We have a mix of Sun and Fujitsu Prime
> Powers... I added the Fujitsu Gigabit interfaces to nicstat and things look
> like they are running well:
>
> Time Int rKb/s wKb/s rPk/s wPk/s rAvs wAvs %Util Sat
> 16:25:48 fjge2 0.24 0.92 2.55 2.85 95.69 329.54 0.00 0.00
> 16:25:48 fjge0 9.93 1.62 15.80 6.18 643.68 269.02 0.01 0.00
> 16:25:48 fjge3 0.11 0.13 1.05 0.80 105.76 168.83 0.00 0.00
> 16:25:48 fjge1 49889.96 1179.98 34305.80 17224.20 1489.17 70.15 41.84 0.00
>
> but then eventually, when the system really gets busy:
>
> Time Int rKb/s wKb/s rPk/s wPk/s rAvs wAvs %Util Sat
> 16:30:48 fjge2 0.58 1.25 5.75 6.52 103.94 195.72 0.00 0.00
> 16:30:48 fjge0 0.86 0.20 6.70 0.93 131.13 218.88 0.00 0.00
> 16:30:48 fjge3 300239975113690.69 268.09 17742.80 3918.23 17327915239782.86 70.06 100.00 0.00
> 16:30:48 fjge1 300239975147902.75 1414.54 41165.58 20648.68 7468513978338.35 70.15 100.00 0.00

You are the first person to ever experience this!

Firstly, try the Perl version - as its output format has been improved for
large traffic.* There is also a "-s" output style for extreme loads, which
will also improve the output.

The actual bug may be a combination of fjge only exporting 32-bit stats,
and a heavy workload overflowing these stats. Please email me the output
from "kstat -n fge1" if you can.

cheers,

Brendan

---
* In particular, the following algorithm made the output columns much neater:

# print_neat - print a float with decimal places if appropriate.
#
# This specifically keeps the width to 7 characters, if possible, plus
# a trailing space.
#
sub print_neat {
        my $num = shift;
        if ($num >= 100000) {
                printf "%7d ", $num;
        } elsif ($num >= 100) {
                printf "%7.1f ", $num;
        } else {
                printf "%7.2f ", $num;
        }
}

We were planning on dropping whitespace to retain column alignment in the
same way as vmstat does, at which point the output will be quite robust.
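
For readers who want to try the formatting rule without Perl, here is a hedged Python port of print_neat. It keeps the same thresholds and printf formats, but returns the string instead of printing it; the name format_neat is hypothetical and not part of nicstat.

```python
def format_neat(num):
    """Format a number in 7 characters, if possible, plus a trailing space."""
    if num >= 100000:
        return "%7d " % int(num)      # large values: drop the decimals
    elif num >= 100:
        return "%7.1f " % num         # mid-range: one decimal place
    else:
        return "%7.2f " % num         # small values: two decimal places


# Brackets make the padding and the trailing space visible.
for value in (0.03, 583.9, 20546.3, 300239975113690.69):
    print("[" + format_neat(value) + "]")
```

Values below 10,000,000 stay within the 7-character field. Anything larger, such as the overflowed rates reported earlier in the thread, still widens the column, so the rule tidies normal output but cannot hide a bad counter.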
Brendan,

FYI, I have posted my latest version of nicstat.c, including binaries, to
my blog - http://blogs.sun.com/timc/

This version does not need to be modified to add a string to match a
Fujitsu-specific network driver, for example.

Regards,
Tim

This message posted from opensolaris.org
Sweet! Any way to get this integrated into opensolaris?!?!

Thanks,
- Ryan
--
UNIX Administrator
http://prefetch.net

On 2/14/07, Tim Cook <Tim.Cook at sun.com> wrote:
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c, including binaries, to
> my blog - http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a string to match a
> Fujitsu-specific network driver, for example.
>
> Regards,
> Tim
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c,
> including binaries, to my blog -
> http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a
> string to match a Fujitsu-specific network driver,
> for example.
>
> Regards,
> Tim

Great and thanks!!

/tmp >./nicstat 5
Time      Int     rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
06:24:52  lo0      0.00    0.00    3.46    3.46    0.00    0.00    0.00    0.00
06:24:52  fjge0    0.32    6.76  7139.7  2082.4    0.05    3.32    0.01    0.00
06:24:52  fjge1    5.58    2.25  2260.8   152.4    2.53   15.14    0.01    0.00
06:24:52  fjge2    4.22    3.51  1829.1  2803.4    2.36    1.28    0.01    0.00
06:24:52  fjge3    3.71    0.67   503.4   217.7    7.55    3.17    0.00    0.00
06:24:52  hme0     0.09    0.00    0.84    0.00   112.4   45.98    0.00    0.00
Time      Int     rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
06:24:57  lo0      0.00    0.00    6.60    6.60    0.00    0.00    0.00    0.00
06:24:57  fjge0    5.33    3.44   21.80   13.00   250.5   271.3    0.01    0.00
06:24:57  fjge1 12058.3   171.2  8374.2  2221.6  1474.5   78.92    10.0    0.00
06:24:57  fjge2 13958.6   330.0  9600.6  4813.6  1488.8   70.19    11.7    0.00
06:24:57  fjge3    0.05    0.03    0.80    0.40   68.00   74.00    0.00    0.00
06:24:57  hme0     0.02    0.00    0.40    0.00   60.00    0.00    0.00    0.00

This message posted from opensolaris.org
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c,
> including binaries, to my blog -
> http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a
> string to match a Fujitsu-specific network driver,
> for example.
>
> Regards,
> Tim

Tim,

Thanks for adding in the fj interfaces but... seems I still have an issue:

Time      Int               rKB/s   wKB/s   rPk/s   wPk/s             rAvs    wAvs   %Util     Sat
06:58:01  lo0                0.00    0.00    2.40    2.40             0.00    0.00    0.00    0.00
06:58:01  fjge0              0.40    0.15    6.80    2.40            60.00   64.00    0.00    0.00
06:58:01  fjge1  3602879701067720   224.4  7115.8  3564.2  518472808945353   64.48     100    0.00
06:58:01  fjge2            4015.2   95.12  2770.8  1390.2           1483.9   70.06    3.37    0.00
06:58:01  fjge3              0.00    0.00    0.00    0.00             0.00    0.00    0.00    0.00
06:58:01  hme0               0.08    0.00    0.80    0.00            105.7    0.00    0.00    0.00

This message posted from opensolaris.org
scrming,
Can you send me the output of the following command:
kstat -Td -pm fjge 5 2
Looks like the issue Brendan previously mentioned may be a factor. Apologies if
you already sent kstat output to Brendan.
Regards,
Tim
This message posted from opensolaris.org
Tim Cook
2007-Feb-15 18:25 UTC
[crossbow-discuss] Re: Re: Network stats with netstat/nicstat
Matty,

All I want to say is, "watch this space". Brendan & I will be seeing what
we can do, but the functionality may appear somewhere else, or have a
different name...

Regards,
Tim

> Sweet! Anyway to get this integrated into
> opensolaris?!?!
>
> Thanks,
> - Ryan
> --
> UNIX Administrator
> http://prefetch.net

This message posted from opensolaris.org
> scrming,
>
> Can you send me the output of the following command:
>
> kstat -Td -pm fjge 5 2
>
> Looks like the issue Brendan previously mentioned may
> be a factor. Apologies if you already sent kstat
> output to Brendan.
>
> Regards,
> Tim

Oops... I forgot about the e-mail I got from Brendan...

> G'Day John,
>
> Thanks for this - it confirmed my suspicions, fjge doesn't support 64-bit
> stats properly, and so its 32-bit stats will sometimes overflow and
> produce wonky numbers. I don't think I'll be able to fix this in
> nicstat - I may have to hunt down the author of fjge to get it fixed.. :)
>
> cheers,
>
> Brendan

Let me know if you guys would like me to involve our Fujitsu support...

This message posted from opensolaris.org
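
The overflow Brendan describes can be illustrated with a short Python sketch. This is a generic wrap-aware subtraction, offered as an assumption about how a monitoring tool could cope with 32-bit counters; it is not code from nicstat or the fjge driver, and the function names and sample readings are hypothetical. It is only correct if the counter wraps at most once between readings, and at 1 Gbit/s a 32-bit byte counter can wrap roughly every 34 seconds.

```python
WRAP_32 = 2 ** 32


def counter_delta(prev, cur, modulus=WRAP_32):
    """Difference between two readings of a counter that wraps at `modulus`.

    Correct only if the counter wrapped at most once between the readings.
    """
    return (cur - prev) % modulus


def naive_unsigned64_delta(prev, cur):
    """What a plain subtraction yields when stored in an unsigned 64-bit value."""
    return (cur - prev) % 2 ** 64


# Hypothetical byte-counter readings taken 5 seconds apart; the counter
# wrapped once in between.
prev, cur = 4294967000, 704
print(counter_delta(prev, cur))                      # bytes actually transferred
print(naive_unsigned64_delta(prev, cur) / 1024 / 5)  # bogus KB/s figure
```

The naive figure comes out near 3.6e15 KB/s, the same order as the fjge1 rate reported at 06:58:01. That is suggestive of a wrapped 32-bit counter being subtracted as an unsigned 64-bit value, but the thread does not confirm the sampling interval or the exact arithmetic.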