Hello,
I use Brendan's sysperfstat script to see overall system
performance, and found that the disk utilization is over 100:
             ------ Utilisation ------   ------ Saturation ------
    Time    %CPU   %Mem   %Disk   %Net    CPU    Mem   Disk    Net
15:51:38   14.52  15.01  200.00  24.42   0.00   0.00  83.53   0.00
15:51:42   11.37  15.01  200.00  25.48   0.00   0.00  88.43   0.00
15:51:45   11.01  15.01  200.00  12.02   0.00   0.00  95.03   0.00
15:51:48   13.80  15.01  200.00  24.87   0.00   0.00  98.86   0.00
15:51:51    9.44  15.01  200.00  17.02   0.00   0.00 102.64   0.00
15:51:54    9.49  15.01  164.59   9.10   0.00   0.00  83.75   0.00
15:51:57   16.58  15.01    2.83  20.46   0.00   0.00   0.00   0.00
How can I fix this? Is there a newer version of this script?
My system is an X4600-M1 with hardware RAID:
disks 0+1 = OS disk = 72 GB = d0
disks 2+3 = apps data disk = 146 GB = d2, an SVM soft partition with one
active UFS file system
At that time, iostat showed strange output:
cpu
us sy wt id
13 9 0 78
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d52
0.0 0.0 0.0 0.0 334.0 1.0 0.0 0.0 100 100 c3t2d0
cpu
us sy wt id
10 5 0 85
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d52
0.0 0.0 0.0 0.0 334.0 1.0 0.0 0.0 100 100 c3t2d0
kr/s & kw/s show 0, but wait is 334.
While this happens, the application always hangs.
# dtrace -n 'io:::start { @files[pid, execname, args[2]->fi_pathname] =
    sum(args[0]->b_bcount); } tick-5sec { exit(); }'
dtrace: description 'io:::start ' matched 7 probes
CPU ID FUNCTION:NAME
8 49675 :tick-5sec
  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_outmsg.ledger    32768
  25456  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    32768
      3  fsflush   <none>    38912
  25418  pt_chmod  /export/data/dbxpt3/logs/NTRPT3-MOCA.log    49152
  21372  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    65536
  16189  nTrade    /export/data/dbxpt3/logs/ledgers/arinapt3.NTRPT3-MOCA.trans_exerep.ledger    81920
  16189  nTrade    /export/data/dbxpt3/logs/ntrade.imbalances.log    114688
  25419  iostat    /export/data/dbxpt3/logs/NTRPT3-MOCA.log    114688
   8018  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    131072
  24915  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    147456
  16189  nTrade    <none>    207872
  20900  tail      /export/data/dbxpt3/logs/NTRPT3-MOCA.log    270336
      0  sched     <none>    782336
  16189  nTrade    /export/data/dbxpt3/logs/NTRPT3-MOCA.log    2162688
The write rate is about 10 MB/s. Did the above dtrace script show the real
I/O going on at that time?
Is there a way to find how much I/O is generated by each process, and how
many I/Os are in the I/O wait queue?
Is there a way to find out the disk RPM besides checking the physical drive?
Thanks,
James Yang
---
This communication may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this communication
in error) please notify the sender immediately and destroy this
communication. Any unauthorized copying, disclosure or distribution of the
material in this communication is strictly forbidden.
Deutsche Bank does not render legal or tax advice, and the information
contained in this communication should not be regarded as such.
Jim Mauro <James.Mauro at Sun.COM>
12/17/08 07:02 PM

This is all very odd... iostat is historically extremely reliable.
I've never observed stats like that before: zero reads and writes
with a non-zero value in the wait queue (forget utilization when
it comes to disk - it's a useless metric).

IO rates per process are best measured at the VOP layer.
Depending on what version of Solaris you're running, you
can use the fsinfo provider (fsinfo::fop_read:entry,
fsinfo::fop_write:entry). If you don't have the fsinfo
provider, instrument the syscall layer to track reads and writes.

Can we get another sample, using "iostat -zxnd 1 20"?

Does the application recover from the hang, or does it
remain hung and require kill/restart?

Thanks,
/jim
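[Editor's sketch of the fsinfo approach described above, for reference. It assumes a Solaris release that ships the fsinfo provider, where the read/write probes carry the byte count in arg1; run it as root, and verify the probe names on your build first with "dtrace -l -P fsinfo".]

```shell
# Per-process read/write bytes at the VOP layer, sampled for 5 seconds.
# The fsinfo:::read / fsinfo:::write probe names and the arg1 byte count
# are assumptions based on the fsinfo provider documentation; adjust if
# your Solaris build differs. Requires root on a Solaris system.
dtrace -n 'fsinfo:::read, fsinfo:::write
    { @bytes[pid, execname, probename] = sum(arg1); }
    tick-5sec { exit(0); }'
```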
Jim,
I'll run "iostat -zxnd 1 20" tomorrow when the application starts.
The application recovered by itself after about 40 seconds, once the disk
I/O dropped.
Thanks,
James Yang
Global Unix Support, IES, GTO
Deutsche Bank US
Phone: 201-593-1360
Email : jianhua.yang at db.com
Pager : 1-800-946-4646 PIN# 6105618
CR: NYC_UNIX_ES_US_UNIX_SUPPORT
http://dcsupport.ies.gto.intranet.db.com/
Brendan Gregg - Sun Microsystems
2008-Dec-18 03:32 UTC
[dtrace-discuss] disk utilization is over 200%
G'Day,

On Wed, Dec 17, 2008 at 06:12:27PM -0500, Jianhua Yang wrote:
> Hello,
> I use Brendan's sysperfstat script to see the overall system
> performance and found the disk utilization is over 100:
> [...]

Disk and Network are in terms of a single disk or network interface. From
the header of sysperfstat:

# The utilisation values for CPU and Memory have maximum values of 100%,
# Disk and Network don't. 100% CPU means all CPUs are running at 100%, however
# 100% Disk means perhaps 1 disk is running at 100%, or 2 disks at 50%;
# a similar calculation is used for Network. There are some sensible
# reasons behind this decision that I hope to document at some point.

Sorry for the confusion. I've erred on the side of false positives rather
than false negatives, as your next step is now to drill down further and use
iostat as Jim suggested.

sysperfstat is a pretty simple Perl script anyway - you can customise it
as needed.

Brendan

--
Brendan Gregg, Sun Microsystems Fishworks. http://blogs.sun.com/brendan
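[Editor's sketch, to make the summing concrete. This is not sysperfstat's actual kstat-based code - it just applies the same sum-versus-max arithmetic to the %b column of the iostat sample posted earlier in this thread. Which devices sysperfstat actually counts is a separate question; this only illustrates why a sum of per-device percentages can exceed 100%.]

```shell
# Sum vs. max of the per-device %b column (field 10 of iostat -x output).
# Four devices each 100% busy: the sum reports 400, the max caps at 100.
cat <<'EOF' > /tmp/iostat.sample
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 335.0 0.0 0.0 0 100 md/d52
0.0 0.0 0.0 0.0 334.0 1.0 0.0 0.0 100 100 c3t2d0
EOF
awk '{ sum += $10; if ($10 > max) max = $10 }
     END { printf "sum %%b = %d, max %%b = %d\n", sum, max }' /tmp/iostat.sample
```

Running it prints "sum %b = 400, max %b = 100".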
Thanks, Brendan!!!
Thanks,
James Yang
You can as well take a look at SDR, which uses sysperfstat and others
behind the scenes under SMF. I'm working on enhancing sysperfstat to take
account of ZFS and fix some bugs I found. I have on my todo list to make
Disk and Net I/O utilisation cap at 100%, the same as CPU and Mem.

http://www.nbl.fi/~nbl97/solaris/perf/index.html

stefan
Hello Jim,
Please see the attached output file (See attached file: io.out.20081218).
I ran iosnoop -eo; the script hung until iostat showed less than 100% busy.
Thanks,
James Yang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: io.out.20081218
Type: application/octet-stream
Size: 14316 bytes
Desc: not available
URL:
<http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20081218/5f797c8a/attachment.obj>
Brendan Gregg - Sun Microsystems
2008-Dec-19 04:02 UTC
[dtrace-discuss] disk utilization is over 200%
On Thu, Dec 18, 2008 at 08:53:36PM +0000, Stefan Parvu wrote:
> You can as well take a look on SDR, uses sysperfstat and others behind
> under SMF. Im working on enhancing sysperfstat to take count of ZFS and
> fix some bugs I found. I have on my todo list to make Disk, Net I/O
> utilisation 100% same as CPU, Mem.

Not the same, I hope. If network utilization was reported as 12%, what
does that mean? It's meaningless without knowing how many interfaces there
are - if there were eight, it could mean that one interface is at 99%
utilization. All users of sysperfstat would need to first run ifconfig -a
and iostat -E to make sense of the sysperfstat output. The risk here is
that users may not identify real system problems - which is bad.

By printing a utilization sum, the user doesn't need to know how many
interfaces there are to identify a saturated interface - it will always
show as 100%. I didn't do this for memory banks or CPUs, since as resources
they are used differently. The risk with summing percentages is that users
get confused by > 100% values, if they don't read the sysperfstat docs or
investigate using other tools to see what is really going on. Bad, but not
as bad as hiding real problems.

My mistake, I think, was to print "%Disk" and "%Net" - that should be
"S%Disk" and "S%Net", or something, to highlight that these are sums of
percentages and not plain percentages.

Someone (Stefan?) did suggest to me at some point a better solution -
instead of printing sum percentages for disk and net, print max
percentages. That way it never goes over 100% (avoiding some of the
confusion), and still identifies individual network and disk problems.
Stefan - I haven't had a chance to code that, but you are welcome to do
that if it sounds sensible. I'd print them as "M%Disk" and "M%Net" or
something.

Brendan

--
Brendan Gregg, Sun Microsystems Fishworks. http://blogs.sun.com/brendan
> are used differently. The risk with summing percentages is that users
> get confused by > 100% values, if they don't read the sysperfstat docs
> or investigate using other tools to see what is really going on. Bad,
> but not as bad as hiding real problems.

Correct, and this works well enough if users are informed what Disk I/O
and Net I/O mean.

> of printing sum percentages for disk and net, print max percentages.
> That way it never goes over 100% (avoiding some of the confusion), and
> still identifys individual network and disk problems. Stefan - I haven't
> had a chance to code that, but you are welcome to do that if it sounds
> sensible. I'd print them as "M%Disk" and "M%Net" or something.

Yep, this would be an easy way to clear up the confusion. At this moment
I'm happy with the way sysperfstat works, but for the future we need to
fix this.

stefan
Hmm... the iostat data does not make sense. Consistent non-zero values
(12/13) in the wait queue, with an IO rate of zero (no reads, no writes).

Is this data from a virtualization environment? An LDOM, a Xen domU, a
VMware virtual machine, or a Solaris Zone?

Thanks,
/jim

Jianhua Yang wrote:
> Hello Jim,
> pls the attached outout file (See attached file: io.out.20081218)
> I ran iosnoop -eo, the script hang until the iostat is less 100% busy.
> [...]
No, this X4600 M1 runs Solaris 10 Update 3 with no local zones:
# uname -a
SunOS nygeqdbxpc2p1 5.10 Generic_127112-02 i86pc i386 i86pc
# cat /etc/release
Solaris 10 11/06 s10x_u3wos_10 X86
Copyright 2006 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 14 November 2006
# zoneadm list -cv
  ID NAME     STATUS    PATH    BRAND    IP
   0 global   running   /       native   shared
#
# df -kl
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d0 8266719 2715437 5468615 34% /
/devices 0 0 0 0% /devices
ctfs 0 0 0 0% /system/contract
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
swap 61579940 776 61579164 1% /etc/svc/volatile
objfs 0 0 0 0% /system/object
/usr/lib/libc/libc_hwcap2.so.1
8266719 2715437 5468615 34% /lib/libc.so.1
fd 0 0 0 0% /dev/fd
swap 62419216 840052 61579164 2% /tmp
swap 61579188 24 61579164 1% /var/run
/dev/md/dsk/d51 986735 1060 926471 1% /export/home/dbxpt3
/export/home/dbxpt3 986735 1060 926471 1% /home/dbxpt3
/dev/md/dsk/d52 10326524 1743828 8479431 18% /export/data/sor-prod
/dev/md/dsk/d50 61962204 353899 60988683 1% /export/data/dbxpt3
/export/data/dbxpt3 61962204 353899 60988683 1% /data/dbxpt3
#
# raidctl
RAID Volume RAID RAID Disk
Volume Type Status Disk Status
------------------------------------------------------
c3t0d0 IM OK c3t0d0 OK
c3t1d0 OK
c3t2d0 IM OK c3t2d0 OK
c3t3d0 OK
Fri Dec 19 10:07:37 2008
cpu
us sy wt id
6 4 0 90
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 18.3 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 18.3 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 18.3 0.0 0.0 0 100 md/d50
0.0 0.0 0.0 0.0 17.3 1.0 0.0 0.0 100 100 c3t2d0
Fri Dec 19 10:07:42 2008
cpu
us sy wt id
4 3 0 93
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 20.0 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 20.0 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 20.0 0.0 0.0 0 100 md/d50
0.0 0.0 0.0 0.0 19.0 1.0 0.0 0.0 100 100 c3t2d0
Fri Dec 19 10:07:47 2008
cpu
us sy wt id
4 3 0 93
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 20.5 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 20.5 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 20.5 0.0 0.0 0 100 md/d50
0.0 0.0 0.0 0.0 19.5 1.0 0.0 0.0 100 100 c3t2d0
Fri Dec 19 10:07:52 2008
cpu
us sy wt id
5 3 0 91
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 21.0 0.0 0.0 0 100 md/d30
0.0 0.0 0.0 0.0 0.0 21.0 0.0 0.0 0 100 md/d40
0.0 0.0 0.0 0.0 0.0 21.0 0.0 0.0 0 100 md/d50
0.0 0.0 0.0 0.0 20.0 1.0 0.0 0.0 100 100 c3t2d0
Fri Dec 19 10:07:57 2008
cpu
us sy wt id
4 3 0 92
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 12.8 0.0 184.7 0.0 3.6 0.0 278.6 0 19 md/d30
0.0 12.8 0.0 184.7 0.0 3.4 0.0 268.3 0 19 md/d40
0.0 6.4 0.0 108.3 0.0 3.4 0.0 536.7 0 19 md/d50
0.0 12.8 0.0 184.7 2.9 0.5 229.7 38.6 15 19 c3t2d0
Fri Dec 19 10:08:02 2008
cpu
us sy wt id
4 3 0 93
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.2 0.0 4.2 0.0 0.0 0.0 7.3 0 0 md/d30
0.0 0.2 0.0 4.2 0.0 0.0 0.0 7.3 0 0 md/d40
0.0 0.2 0.0 4.2 0.0 0.0 0.0 7.4 0 0 md/d50
0.0 0.2 0.0 4.2 0.0 0.0 0.0 7.3 0 0 c3t2d0
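[Editor's note] One consistency check on the samples above: iostat's wait and
actv columns are time-averaged queue lengths, so by Little's law each should
equal the request rate times the average time a request spends in that queue
(wsvc_t and asvc_t, in milliseconds). A minimal sketch, using the numbers from
the 10:07:37 run's final c3t2d0 line (w/s 12.8, wsvc_t 229.7, wait 2.9):

```python
# Little's law: average queue length = arrival rate * average residence time.
# iostat's "wait" column is the time-averaged length of the wait queue, and
# "wsvc_t" is the average time (ms) a request spends in that queue.

def expected_queue_length(iops, svc_t_ms):
    """Queue length implied by Little's law for the given rate and time."""
    return iops * (svc_t_ms / 1000.0)

# From the Fri Dec 19 10:07:57 c3t2d0 line: w/s = 12.8, wsvc_t = 229.7 ms
implied_wait = expected_queue_length(12.8, 229.7)
print(round(implied_wait, 1))  # ~2.9, matching the reported wait column
```

For the stuck intervals (wait around 17-20 with zero completions) the law
cannot hold, which is itself a hint that the queue is draining into an I/O
that never completes.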
I will collect mpstat and vmstat statistics to see whether there is any
unusual activity on the system.
James Yang
---
This communication may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this communication
in error) please notify the sender immediately and destroy this
communication. Any unauthorized copying, disclosure or distribution of the
material in this communication is strictly forbidden.
Deutsche Bank does not render legal or tax advice, and the information
contained in this communication should not be regarded as such.
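[Editor's note] A plausible (unverified) explanation for sysperfstat's 200%
figure: the script aggregates kstat busy percentages across all disk-class
devices, and here the SVM metadevices (md/d30, md/d40, md/d50) are layered
over the single physical disk c3t2d0, so the same busy spindle is counted
several times. A hypothetical sketch with the per-device %b values from the
stuck samples:

```python
# Hypothetical sketch: per-device busy percentages from the iostat samples.
# The md metadevices sit on top of the one physical disk c3t2d0, so a naive
# sum counts the same spindle's busy time repeatedly; taking the max of the
# physical devices gives a bounded per-spindle view.
busy = {"md/d30": 100.0, "md/d40": 100.0, "md/d50": 100.0, "c3t2d0": 100.0}

summed = sum(busy.values())   # naive aggregate over layered devices
worst = max(busy.values())    # busiest single device
print(summed, worst)          # 400.0 100.0
```

Whatever sysperfstat's exact formula, any aggregate above 100 here is an
artifact of layered devices: the physical disk itself is never more than
100% busy.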
Jim Mauro <James.Mauro at Sun.COM> wrote on 12/20/08 09:57 PM
(Re: [dtrace-discuss] disk utilization is over 200%):
> Hmm...the iostat data does not make sense.
> Consistent non-zero values (12/13) in the wait
> queue, with an IO rate of zero (no read, no writes).
> Is this data from a virtualization environment?
> An LDOM, a Xen domU, a VMware virtual machine,
> or a Solaris Zone?
> Thanks,
> /jim
> Hmm...the iostat data does not make sense.
> Consistent non-zero values (12/13) in the wait
> queue, with an IO rate of zero (no read, no writes).

This can occur if an I/O is taking a very long time to
complete, or gets lost. Since it doesn't complete, we
can't count it as completed, so it drives the utilization
metric up. For typical systems, you shouldn't see this,
so I would check the hardware for faults -- I would
expect to see some timeouts/retries.
 -- richard
--
This message posted from opensolaris.org
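[Editor's note] The mechanism Richard describes can be sketched: %b comes
from a busy-time counter that accumulates whenever at least one I/O is
outstanding, so a single request that never completes pegs %b at 100 with
zero throughput (and leaves actv at 1.0, exactly as the c3t2d0 lines show).
A minimal simulation, with hypothetical names:

```python
# Sketch of kstat-style busy accounting with one stuck I/O.
# Busy time accrues whenever >= 1 request is outstanding; throughput is
# counted only when requests actually complete.

def interval_stats(outstanding, completions, interval_s):
    """Return (%b, completions/s) for one sampling interval."""
    busy_s = interval_s if outstanding > 0 else 0.0  # stuck I/O never clears
    pct_b = 100.0 * busy_s / interval_s
    return pct_b, completions / interval_s

# One request issued and never completed, sampled over 5 seconds:
pct_b, iops = interval_stats(outstanding=1, completions=0, interval_s=5.0)
print(pct_b, iops)  # 100.0 0.0 -- fully "busy" with zero reads/writes
```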
I suspected this could be a disk hardware issue too, but found nothing:
the hardware RAID shows OK status, and no SCSI errors were reported in
syslog while the disk was 100% busy in iostat. dtrace would not return
any output until the "%b" column of iostat dropped below 100.

J.Yang

dtrace-discuss-bounces at opensolaris.org wrote on 01/04/2009 11:53:28 PM:

> This can occur if an I/O is taking a very long time to
> complete, or gets lost. Since it doesn't complete, we
> can't count it as completed, so it drives the utilization
> metric up. For typical systems, you shouldn't see this,
> so I would check the hardware for faults -- I would
> expect to see some timeouts/retries.
> -- richard