Brendan Gregg - Sun Microsystems
2007-Jan-24 02:02 UTC
[crossbow-discuss] Network stats with netstat/nicstat
G'Day Folks,

I've just joined this mailing list and have been catching up on the recent thread about a network utilization stats addition to netstat. This sounds like an opportunity to get nicstat, or the look and feel of nicstat, into Solaris. This is nicstat:

$ nicstat 1
    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:00  nge1    0.00    0.00    0.00    0.00    0.00   346.0    0.00    0.00
01:10:00  nge0   11.24   22.01   22.51   28.97   511.6   778.0    0.03    0.00
01:10:00   lo0    0.00    0.00    3.42    3.42    0.00    0.00    0.00    0.00
    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:01  nge1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
01:10:01  nge0  2437.3   583.9 20546.3 10300.5   121.5   58.05    2.47    0.00
01:10:01   lo0    0.00    0.00    1.99    1.99    0.00    0.00    0.00    0.00
    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
01:10:02  nge1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
01:10:02  nge0  2322.5   569.1 20057.3 10038.5   118.6   58.06    2.37    0.00
01:10:02   lo0    0.00    0.00    1.98    1.98    0.00    0.00    0.00    0.00
^C

$ nicstat -h
USAGE: nicstat [-hsz] [-i int[,int...]] | [interval [count]]

         -h              # help
         -i interface    # track interface only
         -n              # show non-local interfaces only (exclude lo0)
         -s              # summary output
         -z              # skip zero value lines
   eg,
      nicstat               # print summary since boot only
      nicstat 1             # print every 1 second
      nicstat 1 5           # print 5 times only
      nicstat -z 1          # print every 1 second, skip zero lines
      nicstat -i hme0 1     # print hme0 only every 1 second

I wrote nicstat some years ago to fetch utilization on systems without the SE Toolkit installed; also to emphasise the notion of performance resource monitoring by utilization and by saturation, and as part of an open source toolkit of kstat-based programs. Tim Cook from Sun PAE and I currently develop the nicstat source. nicstat has been used for years by customers in production, and has had many subtle tweaks based on this experience (with a few more on the todo list).
Recently nicstat and its source code were published in Solaris Performance and Tools, familiarizing more customers with its form of output.

I (and many others) would like to see nicstat, or nicstat's style of output, included in Solaris, either as a command or as a switch to netstat. I don't see why this can't be its own tool - netstat is already a kitchen sink of commands.

Having a prstat-style output would be handy in some situations, especially due to project Crossbow, and could be a switch to nicstat. It is not, however, what is wanted as the default output from a performance tool - customers like the default mpstat-style output of nicstat, and the ability to match on specified interfaces (-i).

Recent versions of nicstat are here:

http://www.brendangregg.com/K9Toolkit/nicstat.c   # C
http://www.brendangregg.com/K9Toolkit/nicstat     # Perl

Tim and I were working on some updates...

Mike, please take a look at the nicstat code; if you can add the features that you think are useful (a prstat-style output as a switch), Tim and I can add our final updates - and we should have an awesome tool that pleases everyone.

cheers,

Brendan

--
Brendan [CA, USA]
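The columns in the sample output above can be reproduced from two snapshots of cumulative interface counters. The following Python sketch is not taken from the nicstat source; the function name `nic_rates`, the dictionary keys, and the 1 Gbit/s link speed are assumptions for illustration. The %Util formula (bytes in both directions, converted to bits, over link speed) is inferred from the sample rows, which it matches for a gigabit link; the Sat column is not covered here.

```python
def nic_rates(prev, curr, interval_s, link_bps=1_000_000_000):
    """Derive nicstat-style columns from two cumulative counter snapshots.

    prev and curr are dicts with hypothetical keys rbytes, wbytes,
    rpackets and wpackets; link_bps is the assumed link speed in bits/s.
    """
    rbytes = curr["rbytes"] - prev["rbytes"]
    wbytes = curr["wbytes"] - prev["wbytes"]
    rpkts = curr["rpackets"] - prev["rpackets"]
    wpkts = curr["wpackets"] - prev["wpackets"]
    return {
        "rKB/s": rbytes / 1024 / interval_s,
        "wKB/s": wbytes / 1024 / interval_s,
        "rPk/s": rpkts / interval_s,
        "wPk/s": wpkts / interval_s,
        # Average packet size in bytes; zero when no packets were seen.
        "rAvs": rbytes / rpkts if rpkts else 0.0,
        "wAvs": wbytes / wpkts if wpkts else 0.0,
        # Assumed formula: bits moved over the bits the link could carry.
        "%Util": 100.0 * (rbytes + wbytes) * 8 / (link_bps * interval_s),
    }


# Counters chosen to approximate the nge0 row at 01:10:01 (1 second interval).
before = {"rbytes": 0, "wbytes": 0, "rpackets": 0, "wpackets": 0}
after = {"rbytes": 2495795, "wbytes": 597914, "rpackets": 20546, "wpackets": 10300}
row = nic_rates(before, after, interval_s=1)
print(" ".join(f"{row[k]:8.2f}" for k in row))
```

Keeping the calculation as a pure function of two snapshots makes it easy to check against captured output without touching a live interface.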
Sorry if this is not the proper place to post...

I just discovered nicstat yesterday! We have a mix of Sun and Fujitsu Prime Powers... I added the Fujitsu Gigabit interfaces to nicstat and things look like they are running well:

    Time   Int    rKb/s    wKb/s    rPk/s    wPk/s     rAvs     wAvs    %Util      Sat
16:25:48 fjge2     0.24     0.92     2.55     2.85    95.69   329.54     0.00     0.00
16:25:48 fjge0     9.93     1.62    15.80     6.18   643.68   269.02     0.01     0.00
16:25:48 fjge3     0.11     0.13     1.05     0.80   105.76   168.83     0.00     0.00
16:25:48 fjge1 49889.96  1179.98 34305.80 17224.20  1489.17    70.15    41.84     0.00

but then, eventually, when the system really gets busy:

    Time   Int    rKb/s    wKb/s    rPk/s    wPk/s     rAvs     wAvs    %Util      Sat
16:30:48 fjge2     0.58     1.25     5.75     6.52   103.94   195.72     0.00     0.00
16:30:48 fjge0     0.86     0.20     6.70     0.93   131.13   218.88     0.00     0.00
16:30:48 fjge3 300239975113690.69   268.09 17742.80  3918.23 17327915239782.86    70.06   100.00     0.00
16:30:48 fjge1 300239975147902.75  1414.54 41165.58 20648.68 7468513978338.35    70.15   100.00     0.00

things appear to get a bit off... any idea?

Thanks!
john

Again, I apologize if this is the wrong area to post this...

This message posted from opensolaris.org
Brendan Gregg - Sun Microsystems
2007-Jan-24 19:16 UTC
[crossbow-discuss] Re: Network stats with netstat/nicstat
G'Day John,

On Wed, Jan 24, 2007 at 10:36:43AM -0800, John wrote:
> Sorry if this is not the proper place to post...
>
> I just discovered nicstat yesterday! We have a mix of Sun and Fujitsu Prime Powers... I added the Fujitsu Gigabit interfaces to nicstat and things look like they are running well:
>
>     Time   Int    rKb/s    wKb/s    rPk/s    wPk/s     rAvs     wAvs    %Util      Sat
> 16:25:48 fjge2     0.24     0.92     2.55     2.85    95.69   329.54     0.00     0.00
> 16:25:48 fjge0     9.93     1.62    15.80     6.18   643.68   269.02     0.01     0.00
> 16:25:48 fjge3     0.11     0.13     1.05     0.80   105.76   168.83     0.00     0.00
> 16:25:48 fjge1 49889.96  1179.98 34305.80 17224.20  1489.17    70.15    41.84     0.00
>
> but then, eventually, when the system really gets busy:
>
>     Time   Int    rKb/s    wKb/s    rPk/s    wPk/s     rAvs     wAvs    %Util      Sat
> 16:30:48 fjge2     0.58     1.25     5.75     6.52   103.94   195.72     0.00     0.00
> 16:30:48 fjge0     0.86     0.20     6.70     0.93   131.13   218.88     0.00     0.00
> 16:30:48 fjge3 300239975113690.69   268.09 17742.80  3918.23 17327915239782.86    70.06   100.00     0.00
> 16:30:48 fjge1 300239975147902.75  1414.54 41165.58 20648.68 7468513978338.35    70.15   100.00     0.00

You are the first person to ever experience this!

Firstly, try the Perl version, as its output format has been improved for large traffic.* There is also a "-s" output style for extreme loads, which will also improve the output.

The actual bug may be a combination of fjge only exporting 32-bit stats, and a heavy workload overflowing these stats. Please email me the output from "kstat -n fjge1" if you can.

cheers,

Brendan

---
* In particular, the following algorithm made the output columns much neater:

# print_neat - print a float with decimal places if appropriate.
#
# This specifically keeps the width to 7 characters, if possible, plus
# a trailing space.
#
sub print_neat {
        my $num = shift;
        if ($num >= 100000) {
                printf "%7d ", $num;
        } elsif ($num >= 100) {
                printf "%7.1f ", $num;
        } else {
                printf "%7.2f ", $num;
        }
}

We were planning on dropping whitespace to retain column alignment in the same way as vmstat does, at which point the output will be quite robust.
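For readers who do not use Perl, here is a Python rendering of the same width rule. It returns the string instead of printing it so that the widths can be checked. The name `neat` is hypothetical, and this is only a sketch of the rule quoted above, not code from nicstat.

```python
def neat(num):
    """Format num in 7 characters plus a trailing space, when it fits."""
    if num >= 100000:
        return "%7d " % num       # large values: drop the decimals
    elif num >= 100:
        return "%7.1f " % num     # mid-range values: one decimal place
    else:
        return "%7.2f " % num     # small values: two decimal places


for value in (0.24, 643.68, 49889.96, 1234567.8, 300239975113690.69):
    print(repr(neat(value)))
```

Values up to 9,999,999 keep the 8-character field; anything larger, such as the overflowed rKb/s figures, still widens the column.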
Brendan,

FYI, I have posted my latest version of nicstat.c, including binaries, to my blog - http://blogs.sun.com/timc/

This version does not need to be modified to add a string to match a Fujitsu-specific network driver, for example.

Regards,
Tim
Sweet! Any way to get this integrated into opensolaris?!?!

Thanks,
- Ryan
--
UNIX Administrator
http://prefetch.net

On 2/14/07, Tim Cook <Tim.Cook at sun.com> wrote:
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c, including binaries, to my blog - http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a string to match a Fujitsu-specific network driver, for example.
>
> Regards,
> Tim
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c,
> including binaries, to my blog -
> http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a
> string to match a Fujitsu-specific network driver,
> for example.
>
> Regards,
> Tim

Great and thanks!!

/tmp >./nicstat 5
    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
06:24:52   lo0    0.00    0.00    3.46    3.46    0.00    0.00    0.00    0.00
06:24:52 fjge0    0.32    6.76  7139.7  2082.4    0.05    3.32    0.01    0.00
06:24:52 fjge1    5.58    2.25  2260.8   152.4    2.53   15.14    0.01    0.00
06:24:52 fjge2    4.22    3.51  1829.1  2803.4    2.36    1.28    0.01    0.00
06:24:52 fjge3    3.71    0.67   503.4   217.7    7.55    3.17    0.00    0.00
06:24:52  hme0    0.09    0.00    0.84    0.00   112.4   45.98    0.00    0.00
    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
06:24:57   lo0    0.00    0.00    6.60    6.60    0.00    0.00    0.00    0.00
06:24:57 fjge0    5.33    3.44   21.80   13.00   250.5   271.3    0.01    0.00
06:24:57 fjge1 12058.3   171.2  8374.2  2221.6  1474.5   78.92    10.0    0.00
06:24:57 fjge2 13958.6   330.0  9600.6  4813.6  1488.8   70.19    11.7    0.00
06:24:57 fjge3    0.05    0.03    0.80    0.40   68.00   74.00    0.00    0.00
06:24:57  hme0    0.02    0.00    0.40    0.00   60.00    0.00    0.00    0.00
> Brendan,
>
> FYI, I have posted my latest version of nicstat.c,
> including binaries, to my blog -
> http://blogs.sun.com/timc/
>
> This version does not need to be modified to add a
> string to match a Fujitsu-specific network driver,
> for example.
>
> Regards,
> Tim

Tim,

Thanks for adding in the fj interfaces but... it seems I still have an issue:

    Time   Int   rKB/s   wKB/s   rPk/s   wPk/s    rAvs    wAvs   %Util     Sat
06:58:01   lo0    0.00    0.00    2.40    2.40    0.00    0.00    0.00    0.00
06:58:01 fjge0    0.40    0.15    6.80    2.40   60.00   64.00    0.00    0.00
06:58:01 fjge1 3602879701067720   224.4  7115.8  3564.2 518472808945353   64.48     100    0.00
06:58:01 fjge2  4015.2   95.12  2770.8  1390.2  1483.9   70.06    3.37    0.00
06:58:01 fjge3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
06:58:01  hme0    0.08    0.00    0.80    0.00   105.7    0.00    0.00    0.00
scrming,

Can you send me the output of the following command:

    kstat -Td -pm fjge 5 2

Looks like the issue Brendan previously mentioned may be a factor. Apologies if you already sent kstat output to Brendan.

Regards,
Tim
Tim Cook
2007-Feb-15 18:25 UTC
[crossbow-discuss] Re: Re: Network stats with netstat/nicstat
Matty,

All I want to say is, "watch this space". Brendan & I will be seeing what we can do, but the functionality may appear somewhere else, or have a different name...

Regards,
Tim

> Sweet! Any way to get this integrated into
> opensolaris?!?!
>
> Thanks,
> - Ryan
> --
> UNIX Administrator
> http://prefetch.net
> scrming,
>
> Can you send me the output of the following command:
>
> kstat -Td -pm fjge 5 2
>
> Looks like the issue Brendan previously mentioned may
> be a factor. Apologies if you already sent kstat
> output to Brendan.
>
> Regards,
> Tim

Oops... I forgot about the e-mail I got from Brendan...

> G'Day John,
>
> Thanks for this - it confirmed my suspicions, fjge doesn't support 64-bit
> stats properly, and so its 32-bit stats will sometimes overflow and
> produce wonky numbers. I don't think I'll be able to fix this in
> nicstat - I may have to hunt down the author of fjge to get it fixed.. :)
>
> cheers,
>
> Brendan

Let me know if you guys would like me to involve our Fujitsu support...
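Brendan's diagnosis can be illustrated with a little arithmetic. The sketch below assumes that two counter samples are subtracted as unsigned 64-bit integers, which is an assumption about nicstat and not something stated in this thread; the counter values and the helper names `naive_delta` and `wrap_aware_delta` are made up for illustration. When a 32-bit counter wraps between samples, the naive difference lands just under 2^64, which at a 5 second interval is about 3.6e15 KB/s, the same magnitude as the fjge1 row reported earlier.

```python
U64 = 1 << 64
U32 = 1 << 32


def naive_delta(prev, curr):
    """Difference as unsigned 64-bit arithmetic would compute it."""
    return (curr - prev) % U64


def wrap_aware_delta(prev, curr, bits=32):
    """Difference for a counter known to be `bits` wide, allowing one wrap."""
    return (curr - prev) % (1 << bits)


# Illustrative values: a 32-bit byte counter close to its limit, then
# 52,000,000 bytes later, after it has wrapped past 2^32.
prev = 4_250_000_000
curr = (prev + 52_000_000) % U32
interval_s = 5

naive_kbs = naive_delta(prev, curr) / 1024 / interval_s
fixed_kbs = wrap_aware_delta(prev, curr) / 1024 / interval_s
print(f"naive: {naive_kbs:.0f} KB/s")
print(f"wrap-aware: {fixed_kbs:.2f} KB/s")
```

The wrap-aware version only helps if the counter is known to be 32 bits wide and wraps at most once between samples. At 1 Gbit/s a 32-bit byte counter can wrap in roughly 34 seconds, so long sampling intervals would still mislead.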