Hello zfs-discuss,
I have an NFS server using ZFS as the local file system.
System is snv_39 on SPARC.
There are 6 raid-z pools (p1-p6).
The problem is that I do not see any heavy traffic on the network
interfaces nor using zpool iostat. However, using just the old iostat I can
see MUCH more traffic going to the local disks. The difference is something
like 10x - zpool iostat shows for example ~6MB/s of reads, however
iostat shows ~50MB/s. The question is: who's lying?
As the server is not performing that well, I suspect iostat is more
accurate. Or maybe zpool iostat shows only 'application data' being
transferred while iostat shows the 'real' IOs to disks - would there
be that big a difference (checksums, what else?)???
On the other hand, when I look at how much traffic is on the network
interfaces, it's much closer to what I see using zpool iostat.
So maybe zfs introduces that much overhead after all and zpool iostat
shows app data being transferred.
Clients mount resources using NFSv3 over TCP.
nfsd is set to have 2048 threads - all are utilized most of the day.
Below iostat and zpool iostat output - both run at the same time in
different terminals.
bash-3.00# iostat -xnzC 1 | egrep "devic| c4$"
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1095.5 1390.7 64252.7 13742.5 0.0 137.5 0.0 55.3 0 1051 c4
470.2 4394.0 28882.4 3462.0 0.0 748.7 0.0 153.9 0 5388 c4
893.7 3262.0 55206.1 3124.8 0.0 680.4 0.0 163.7 0 6391 c4
965.6 3043.7 61801.2 2727.4 0.0 358.0 0.0 89.3 0 5119 c4
1162.8 2422.9 73277.0 5953.1 0.0 506.9 0.0 141.4 0 5390 c4
1693.1 1292.4 98599.2 1806.1 0.0 538.3 0.0 180.3 0 5204 c4
1551.7 1343.3 99808.4 1142.5 0.0 899.3 0.0 310.6 0 6300 c4
624.2 4002.8 39899.0 3435.7 0.0 429.7 0.0 92.9 0 4048 c4
1017.1 2735.7 65866.1 5809.7 0.0 325.9 0.0 86.8 0 4425 c4
1038.9 2817.9 66914.1 4276.2 0.0 212.3 0.0 55.0 0 4241 c4
784.4 3410.0 48851.0 9078.4 0.0 349.9 0.0 83.4 0 4579 c4
732.3 3542.8 46408.7 8075.1 0.0 526.4 0.0 123.1 0 4075 c4
928.1 3108.3 54917.9 7490.8 0.0 811.8 0.0 201.1 0 5750 c4
931.0 2943.1 55627.1 10331.7 0.0 846.0 0.0 218.4 0 5795 c4
^C
bash-3.00#
bash-3.00# zpool iostat 1
capacity operations bandwidth
pool used avail read write read write
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 79 72 1.15M 920K
p2 738G 78.2G 64 92 1.06M 1.27M
p3 733G 83.1G 61 98 1.12M 1.28M
p4 665G 82.7G 5 11 51.8K 55.4K
p5 704G 43.9G 80 61 1.09M 873K
p6 697G 51.2G 73 67 1.04M 935K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 13 128 276K 767K
p2 738G 78.2G 16 129 1.47M 704K
p3 733G 83.1G 10 192 388K 683K
p4 665G 82.7G 16 3 37.1K 5.24K
p5 704G 43.9G 11 172 34.2K 617K
p6 697G 51.2G 12 35 31.4K 140K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 5 87 27.2K 739K
p2 738G 78.2G 15 93 39.6K 391K
p3 733G 83.1G 15 151 51.5K 298K
p4 665G 82.7G 73 27 1.07M 118K
p5 704G 43.9G 41 62 1.85M 317K
p6 697G 51.2G 16 152 75.8K 879K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 39 83 211K 505K
p2 738G 78.2G 27 76 562K 396K
p3 733G 83.1G 38 77 109K 276K
p4 665G 82.7G 0 1 0 6.67K
p5 704G 43.9G 30 78 83.4K 596K
p6 697G 51.2G 29 85 110K 702K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 394 39 1018K 3.09M
p2 738G 78.2G 12 157 29.0K 274K
p3 733G 83.1G 2 109 12.8K 844K
p4 665G 82.7G 3 4 14.3K 20.0K
p5 704G 43.9G 32 44 85.2K 527K
p6 697G 51.2G 62 47 3.93M 365K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 159 0 421K 3.81K
p2 738G 78.2G 18 86 174K 407K
p3 733G 83.1G 28 89 121K 230K
p4 665G 82.7G 94 17 7.27M 43.3K
p5 704G 43.9G 25 0 225K 0
p6 697G 51.2G 80 28 7.10M 810K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 287 4 736K 34.2K
p2 738G 78.2G 2 81 8.06K 389K
p3 733G 83.1G 9 57 19.0K 493K
p4 665G 82.7G 62 17 5.38M 70.6K
p5 704G 43.9G 28 18 315K 152K
p6 697G 51.2G 70 3 7.26M 133K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 24 20 75.3K 239K
p2 738G 78.2G 36 228 576K 477K
p3 733G 83.1G 0 220 4.74K 662K
p4 665G 82.7G 33 12 323K 35.1K
p5 704G 43.9G 9 311 26.6K 1.28M
p6 697G 51.2G 31 2 87.4K 11.4K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 28 1 148K 11.4K
p2 738G 78.2G 47 109 1.42M 1.11M
p3 733G 83.1G 24 243 73.3K 661K
p4 665G 82.7G 0 0 4.28K 0
p5 704G 43.9G 32 234 95.6K 1.32M
p6 697G 51.2G 66 24 177K 2.23M
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 12 6 29.4K 64.5K
p2 738G 78.2G 27 98 80.1K 1.46M
p3 733G 83.1G 21 71 171K 795K
p4 665G 82.7G 58 5 2.77M 19.4K
p5 704G 43.9G 23 92 61.1K 470K
p6 697G 51.2G 209 9 561K 428K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 18 1 56.7K 125K
p2 738G 78.2G 17 76 48.2K 4.76M
p3 733G 83.1G 29 95 88.5K 1.07M
p4 665G 82.7G 10 112 146K 197K
p5 704G 43.9G 16 368 45.5K 815K
p6 697G 51.2G 26 129 79.6K 844K
---------- ----- ----- ----- ----- ----- -----
p1 751G 64.6G 28 49 76.2K 4.49M
p2 738G 78.2G 37 9 107K 209K
p3 733G 83.1G 27 138 198K 938K
p4 665G 82.7G 17 164 415K 258K
p5 704G 43.9G 7 223 29.9K 930K
p6 697G 51.2G 21 132 905K 458K
---------- ----- ----- ----- ----- ----- -----
^C
bash-3.00#
Example full iostat output (all disks, iostat -xnz 1)
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1032.2 2751.8 64054.4 7491.7 0.0 682.4 0.0 180.3 0 5182 c4
21.2 45.5 1158.6 214.8 0.0 3.6 0.0 54.0 0 84
c4t500000E0118AC370d0
28.3 38.4 1494.2 208.8 0.0 4.8 0.0 71.7 0 84
c4t500000E0118B0390d0
11.1 64.7 711.7 302.8 0.0 10.9 0.0 143.2 0 100
c4t500000E0118F1FD0d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 74.5 0 8
c4t500000E011C19D60d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 76.2 0 8
c4t500000E0118C3220d0
47.5 2.0 3105.7 35.4 0.0 6.2 0.0 125.6 0 71
c4t500000E011902FA0d0
8.1 64.7 517.6 296.7 0.0 9.9 0.0 135.8 0 100
c4t500000E0118F2190d0
0.0 62.7 0.0 45.5 0.0 6.4 0.0 102.9 0 81
c4t500000E0119091E0d0
54.6 3.0 3558.6 35.4 0.0 8.3 0.0 143.2 0 80
c4t500000E011903120d0
10.1 61.7 647.0 271.4 0.0 6.6 0.0 92.2 0 94
c4t500000E0118F2350d0
1.0 61.7 64.7 45.5 0.0 6.4 0.0 101.8 0 84
c4t500000E0119032A0d0
47.5 2.0 3170.4 35.4 0.0 6.6 0.0 133.8 0 70
c4t500000E011903260d0
2.0 64.7 129.4 48.0 0.0 6.5 0.0 97.9 0 91
c4t500000E011909320d0
0.0 41.4 0.0 35.9 0.0 4.5 0.0 108.5 0 55
c4t500000E011903300d0
2.0 63.7 129.4 47.5 0.0 6.5 0.0 99.5 0 87
c4t500000E011909300d0
2.0 43.5 129.4 36.4 0.0 4.6 0.0 102.2 0 62
c4t500000E011903340d0
45.5 2.0 3041.0 35.4 0.0 6.4 0.0 135.0 0 70
c4t500000E011903320d0
1.0 62.7 64.7 46.5 0.0 6.4 0.0 100.8 0 84
c4t500000E0119095A0d0
22.2 45.5 1223.3 214.8 0.0 3.6 0.0 52.6 0 86
c4t500000E01192B420d0
37.4 2.0 2523.4 35.4 0.0 4.5 0.0 115.0 0 59
c4t500000E01190E6D0d0
9.1 55.6 582.3 283.6 0.0 5.3 0.0 81.4 0 86
c4t500000E01190E6B0d0
57.6 2.0 3752.7 35.4 0.0 8.9 0.0 149.6 0 82
c4t500000E01190E750d0
43.5 2.0 2846.9 35.4 0.0 5.3 0.0 116.4 0 66
c4t500000E01190E7F0d0
56.6 2.0 3752.7 35.4 0.0 8.4 0.0 144.0 0 81
c4t500000E01190E730d0
29.3 44.5 1558.9 212.8 0.0 5.0 0.0 67.1 0 96
c4t500000E01192B540d0
9.1 65.7 582.3 313.9 0.0 11.2 0.0 149.4 0 100
c4t500000E0118EDB20d0
0.0 78.9 0.0 75.3 0.0 35.0 0.0 443.8 0 100
c4t500000E0119495A0d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 85.8 0 9
c4t500000E01194A6F0d0
28.3 46.5 1611.5 213.8 0.0 5.0 0.0 67.0 0 97
c4t500000E01194A610d0
10.1 63.7 711.7 300.3 0.0 8.5 0.0 114.9 0 100
c4t500000E0118EDCC0d0
12.1 61.7 776.4 300.8 0.0 11.0 0.0 149.7 0 100
c4t500000E0118EDCA0d0
0.0 78.9 0.0 76.3 0.0 35.0 0.0 443.8 0 100
c4t500000E01194A750d0
0.0 78.9 0.0 76.3 0.0 35.0 0.0 443.8 0 100
c4t500000E01194A710d0
0.0 78.9 0.0 79.4 0.0 35.0 0.0 443.8 0 100
c4t500000E01194A730d0
0.0 78.9 0.0 75.8 0.0 35.0 0.0 443.8 0 100
c4t500000E01194A810d0
25.3 42.5 1300.1 208.3 0.0 4.4 0.0 64.3 0 85
c4t500000E0118C3230d0
9.1 58.6 582.3 262.8 0.0 5.9 0.0 86.4 0 88
c4t500000E0118F2060d0
42.5 3.0 2782.2 35.4 0.0 5.7 0.0 125.7 0 66
c4t500000E011902FB0d0
43.5 2.0 2846.9 35.4 0.0 5.3 0.0 115.7 0 65
c4t500000E0119030D0d0
2.0 64.7 129.4 47.5 0.0 6.7 0.0 99.9 0 87
c4t500000E011903030d0
11.1 61.7 711.7 303.3 0.0 9.6 0.0 131.7 0 98
c4t500000E0118F21C0d0
7.1 59.6 452.9 270.9 0.0 5.2 0.0 78.4 0 90
c4t500000E0118F2180d0
43.5 2.0 2846.9 35.4 0.0 5.6 0.0 122.3 0 68
c4t500000E0119030F0d0
60.7 2.0 3946.8 35.4 0.0 10.1 0.0 160.7 0 85
c4t500000E0119031B0d0
1.0 41.4 64.7 33.9 0.0 4.6 0.0 108.7 0 56
c4t500000E011903190d0
3.0 58.6 194.1 43.5 0.0 6.4 0.0 104.1 0 80
c4t500000E0119032D0d0
1.0 62.7 64.7 46.5 0.0 6.5 0.0 102.7 0 83
c4t500000E011903350d0
3.0 59.6 194.1 44.0 0.0 6.5 0.0 103.6 0 82
c4t500000E011903370d0
27.3 37.4 1429.5 207.2 0.0 4.6 0.0 70.7 0 80
c4t500000E01192B150d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 80.8 0 8
c4t500000E01192B2F0d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 83.2 0 8
c4t500000E0118ABA70d0
2.0 0.0 129.4 0.0 0.0 0.2 0.0 96.4 0 19
c4t500000E01192B390d0
30.3 44.5 1623.6 212.8 0.0 4.9 0.0 65.8 0 96
c4t500000E01192B3B0d0
28.3 41.4 1494.2 211.3 0.0 4.7 0.0 67.7 0 86
c4t500000E0118ABC50d0
2.0 0.0 129.4 0.0 0.0 0.2 0.0 94.1 0 19
c4t500000E0118B7B10d0
25.3 37.4 1417.4 209.3 0.0 4.0 0.0 63.4 0 80
c4t500000E0119494D0d0
8.1 53.6 517.6 264.4 0.0 4.5 0.0 72.8 0 82
c4t500000E0118EDA10d0
0.0 77.8 0.0 78.3 0.0 35.0 0.0 449.6 0 100
c4t500000E011949570d0
22.2 37.4 1223.3 209.3 0.0 3.5 0.0 59.0 0 75
c4t500000E01194A620d0
0.0 77.8 0.0 77.8 0.0 35.0 0.0 449.6 0 100
c4t500000E01194A660d0
0.0 77.8 0.0 76.3 0.0 35.0 0.0 449.6 0 100
c4t500000E011949630d0
28.3 44.5 1611.5 211.8 0.0 4.4 0.0 60.3 0 95
c4t500000E01194A740d0
0.0 77.8 0.0 74.3 0.0 35.0 0.0 449.6 0 100
c4t500000E01194A780d0
0.0 77.8 0.0 76.3 0.0 35.0 0.0 449.6 0 100
c4t500000E01194A720d0
1.0 0.0 64.7 0.0 0.0 0.1 0.0 77.5 0 8
c4t500000E01194A760d0
2.0 0.0 129.4 0.0 0.0 0.2 0.0 91.1 0 18
c4t500000E01194A8A0d0
0.0 77.8 0.0 74.3 0.0 35.0 0.0 449.6 0 100
c4t500000E01194A8C0d0
0.0 0.0 0.0 0.0 0.0 2.0 0.0 0.0 0 100
c4t500000E01194A840d0
^C
bash-3.00#
All pools have atime set to off, and sharenfs is set.
Other than that, the remaining parameters are left at their defaults.
bash-3.00# zfs list
NAME USED AVAIL REFER MOUNTPOINT
p1 752G 51.6G 53K /p1
p1/d5201 383G 17.0G 383G /p1/d5201
p1/d5202 368G 31.5G 368G /p1/d5202
p2 738G 65.4G 53K /p2
p2/d5203 376G 24.2G 376G /p2/d5203
p2/d5204 362G 38.1G 362G /p2/d5204
p3 733G 70.3G 53K /p3
p3/d5205 366G 33.8G 366G /p3/d5205
p3/d5206 367G 33.4G 367G /p3/d5206
p4 665G 71.1G 53K /p4
p4/d5207 328G 71.1G 328G /p4/d5207
p4/d5208 337G 62.9G 337G /p4/d5208
p5 704G 32.2G 53K /p5
p5/d5209 310G 32.2G 310G /p5/d5209
p5/d5210 393G 6.52G 393G /p5/d5210
p6 697G 39.5G 53K /p6
p6/d5211 394G 5.76G 394G /p6/d5211
p6/d5212 302G 39.5G 302G /p6/d5212
bash-3.00#
bash-3.00# zpool status
pool: p1
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p1 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E011909320d0 ONLINE 0 0 0
c4t500000E011909300d0 ONLINE 0 0 0
c4t500000E011903030d0 ONLINE 0 0 0
c4t500000E011903300d0 ONLINE 0 0 0
c4t500000E0119091E0d0 ONLINE 0 0 0
c4t500000E0119032D0d0 ONLINE 0 0 0
c4t500000E011903370d0 ONLINE 0 0 0
c4t500000E011903190d0 ONLINE 0 0 0
c4t500000E011903350d0 ONLINE 0 0 0
c4t500000E0119095A0d0 ONLINE 0 0 0
c4t500000E0119032A0d0 ONLINE 0 0 0
c4t500000E011903340d0 ONLINE 0 0 0
errors: No known data errors
pool: p2
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p2 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E011902FB0d0 ONLINE 0 0 0
c4t500000E0119030F0d0 ONLINE 0 0 0
c4t500000E01190E730d0 ONLINE 0 0 0
c4t500000E01190E7F0d0 ONLINE 0 0 0
c4t500000E011903120d0 ONLINE 0 0 0
c4t500000E01190E750d0 ONLINE 0 0 0
c4t500000E0119031B0d0 ONLINE 0 0 0
c4t500000E0119030D0d0 ONLINE 0 0 0
c4t500000E011903260d0 ONLINE 0 0 0
c4t500000E011903320d0 ONLINE 0 0 0
c4t500000E011902FA0d0 ONLINE 0 0 0
c4t500000E01190E6D0d0 ONLINE 0 0 0
errors: No known data errors
pool: p3
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p3 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E01194A620d0 ONLINE 0 0 0
c4t500000E0119494D0d0 ONLINE 0 0 0
c4t500000E0118ABC50d0 ONLINE 0 0 0
c4t500000E0118B0390d0 ONLINE 0 0 0
c4t500000E01194A610d0 ONLINE 0 0 0
c4t500000E01194A740d0 ONLINE 0 0 0
c4t500000E01192B3B0d0 ONLINE 0 0 0
c4t500000E0118C3230d0 ONLINE 0 0 0
c4t500000E0118AC370d0 ONLINE 0 0 0
c4t500000E01192B420d0 ONLINE 0 0 0
c4t500000E01192B540d0 ONLINE 0 0 0
c4t500000E01192B150d0 ONLINE 0 0 0
errors: No known data errors
pool: p4
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p4 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E01192B2F0d0 ONLINE 0 0 0
c4t500000E01194A760d0 ONLINE 0 0 0
c4t500000E01192B290d0 ONLINE 0 0 0
c4t500000E011C19D60d0 ONLINE 0 0 0
c4t500000E0118C3220d0 ONLINE 0 0 0
c4t500000E0118ABA70d0 ONLINE 0 0 0
c4t500000E01194A6F0d0 ONLINE 0 0 0
c4t500000E01192B390d0 ONLINE 0 0 0
c4t500000E01194A840d0 ONLINE 0 0 0
c4t500000E0118B7B10d0 ONLINE 0 0 0
c4t500000E01194A8A0d0 ONLINE 0 0 0
errors: No known data errors
pool: p5
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p5 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E0118EDCC0d0 ONLINE 0 0 0
c4t500000E0118EDCA0d0 ONLINE 0 0 0
c4t500000E0118F2060d0 ONLINE 0 0 0
c4t500000E0118F2350d0 ONLINE 0 0 0
c4t500000E0118F2180d0 ONLINE 0 0 0
c4t500000E0118F2190d0 ONLINE 0 0 0
c4t500000E0118EDB20d0 ONLINE 0 0 0
c4t500000E0118EDA10d0 ONLINE 0 0 0
c4t500000E01190E6B0d0 ONLINE 0 0 0
c4t500000E0118F21C0d0 ONLINE 0 0 0
c4t500000E0118F1FD0d0 ONLINE 0 0 0
errors: No known data errors
pool: p6
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
p6 ONLINE 0 0 0
raidz ONLINE 0 0 0
c4t500000E01194A810d0 ONLINE 0 0 0
c4t500000E01194A780d0 ONLINE 0 0 0
c4t500000E01194A710d0 ONLINE 0 0 0
c4t500000E011949630d0 ONLINE 0 0 0
c4t500000E01194A730d0 ONLINE 0 0 0
c4t500000E01194A660d0 ONLINE 0 0 0
c4t500000E0119495A0d0 ONLINE 0 0 0
c4t500000E01194A720d0 ONLINE 0 0 0
c4t500000E01194A750d0 ONLINE 0 0 0
c4t500000E01194A8C0d0 ONLINE 0 0 0
c4t500000E011949570d0 ONLINE 0 0 0
errors: No known data errors
bash-3.00#
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Hello Robert,
Wednesday, May 31, 2006, 12:22:34 AM, you wrote:
RM> [... original message quoted in full above - snipped ...]
bash-3.00# fsstat zfs 1
[...]
new name name attr attr lookup rddir read read write write
file remov chng get set ops ops ops bytes ops bytes
10 12 8 919 7 102 0 32 975K 26 652K zfs
6 21 10 1.22K 1 123 0 205 6.23M 4 33.5K zfs
14 26 3 1.14K 9 127 0 46 1.33M 5 60.1K zfs
13 11 8 1.02K 7 102 0 43 1.24M 22 514K zfs
10 17 10 998 6 87 0 31 746K 85 2.45M zfs
11 15 3 915 24 93 0 60 1.86M 6 54.3K zfs
7 31 19 1.82K 5 167 0 23 636K 278 8.22M zfs
14 22 13 1.44K 10 104 0 31 992K 257 7.84M zfs
5 18 5 1.16K 4 80 0 26 764K 262 8.06M zfs
1 19 6 572 2 75 0 19 579K 3 20.6K zfs
^C
and iostat at the same time:
bash-3.00# iostat -xnzC 1|egrep "devic| c4$"
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1094.7 1395.7 64212.0 13725.1 0.1 138.5 0.0 55.6 0 1060 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
707.5 3585.5 44721.8 8486.7 0.0 583.7 0.0 136.0 0 5787 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
871.1 3186.4 55818.3 3175.2 0.0 944.1 0.0 232.7 0 6533 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1117.7 2557.8 72886.6 2516.0 0.0 748.7 0.0 203.7 0 6290 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
843.8 3277.9 54130.4 4772.4 0.0 723.7 0.0 175.6 0 6532 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1285.1 2339.0 76506.6 2233.9 0.0 771.5 0.0 212.9 0 6626 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
849.8 3340.9 54389.5 3133.2 0.0 513.4 0.0 122.5 0 6212 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
834.2 3358.8 53391.5 4774.1 0.0 640.4 0.0 152.7 0 6216 c4
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
905.4 3115.2 59024.8 3698.9 0.0 588.2 0.0 146.3 0 5078 c4
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
That's interesting - 'zpool iostat' shows quite a small
read volume for any pool, however if I run 'zpool iostat -v'
then I can see that while the read volume for a pool is small, the read volume to each
disk is actually quite large, so in summary I get over 10x the read volume if I sum
all the disks in a pool compared to the pool itself. These data are consistent with iostat.
So now even zpool claims that it actually issues 10x (and more) read volume to
all the disks in a pool than to the pool itself.
Now - why???? It really hits performance here...
bash-3.00# zpool iostat -v p1 1
capacity operations bandwidth
pool used avail read write read write
------------------------- ----- ----- ----- ----- ----- -----
p1 749G 67.2G 58 90 878K 903K
raidz 749G 67.2G 58 90 878K 903K
c4t500000E011909320d0 - - 15 40 959K 87.3K
c4t500000E011909300d0 - - 14 40 929K 86.5K
c4t500000E011903030d0 - - 18 40 1.11M 86.8K
c4t500000E011903300d0 - - 13 32 823K 77.7K
c4t500000E0119091E0d0 - - 15 40 961K 87.3K
c4t500000E0119032D0d0 - - 14 40 930K 86.5K
c4t500000E011903370d0 - - 18 40 1.11M 86.8K
c4t500000E011903190d0 - - 13 32 828K 77.8K
c4t500000E011903350d0 - - 15 40 964K 87.3K
c4t500000E0119095A0d0 - - 14 40 934K 86.5K
c4t500000E0119032A0d0 - - 18 40 1.11M 86.8K
c4t500000E011903340d0 - - 13 32 821K 77.7K
------------------------- ----- ----- ----- ----- ----- -----
capacity operations bandwidth
pool used avail read write read write
------------------------- ----- ----- ----- ----- ----- -----
p1 749G 67.2G 49 44 897K 1.02M
raidz 749G 67.2G 49 44 897K 1.02M
c4t500000E011909320d0 - - 17 25 1.05M 96.4K
c4t500000E011909300d0 - - 15 25 972K 96.2K
c4t500000E011903030d0 - - 20 25 1.25M 96.3K
c4t500000E011903300d0 - - 14 25 853K 91.2K
c4t500000E0119091E0d0 - - 16 25 1017K 96.7K
c4t500000E0119032D0d0 - - 15 25 955K 96.7K
c4t500000E011903370d0 - - 19 25 1.21M 96.6K
c4t500000E011903190d0 - - 13 25 843K 91.0K
c4t500000E011903350d0 - - 16 25 1001K 96.5K
c4t500000E0119095A0d0 - - 15 25 974K 96.3K
c4t500000E0119032A0d0 - - 20 25 1.22M 96.5K
c4t500000E011903340d0 - - 14 25 855K 90.7K
------------------------- ----- ----- ----- ----- ----- -----
^C
bash-3.00#
Another pool - different array, different host, different workload.
And again - the summary read throughput to all disks in a pool is 10x bigger than to the
pool itself.
Any idea?
bash-3.00# zpool iostat -v 1
capacity operations bandwidth
pool used avail read write read write
-------------------------------------- ----- ----- ----- ----- ----- -----
nfs-s5-1 4.32T 16.1T 304 127 11.9M 506K
raidz 4.32T 16.1T 304 127 11.9M 506K
c4t600C0FF00000000009258F2411CF3D01d0 - - 148 48 8.48M 67.8K
c4t600C0FF00000000009258F6FA45D3801d0 - - 148 48 8.48M 67.9K
c4t600C0FF00000000009258F1820617F01d0 - - 146 48 8.46M 67.9K
c4t600C0FF00000000009258F24546FAC01d0 - - 146 48 8.45M 67.9K
c4t600C0FF00000000009258F5949030301d0 - - 146 48 8.46M 67.9K
c4t600C0FF00000000009258F24E8AADD01d0 - - 146 48 8.45M 67.9K
c4t600C0FF00000000009258F5FD5023B01d0 - - 146 48 8.46M 67.9K
c4t600C0FF00000000009258F17E7007801d0 - - 146 48 8.46M 67.9K
c4t600C0FF00000000009258F598F6BE701d0 - - 146 48 8.46M 67.8K
-------------------------------------- ----- ----- ----- ----- ----- -----
capacity operations bandwidth
pool used avail read write read write
-------------------------------------- ----- ----- ----- ----- ----- -----
nfs-s5-1 4.32T 16.1T 508 72 34.5M 282K
raidz 4.32T 16.1T 508 72 34.5M 282K
c4t600C0FF00000000009258F2411CF3D01d0 - - 254 25 14.1M 37.6K
c4t600C0FF00000000009258F6FA45D3801d0 - - 248 24 13.8M 38.1K
c4t600C0FF00000000009258F1820617F01d0 - - 247 26 13.9M 37.6K
c4t600C0FF00000000009258F24546FAC01d0 - - 240 26 13.8M 37.8K
c4t600C0FF00000000009258F5949030301d0 - - 243 25 14.0M 37.3K
c4t600C0FF00000000009258F24E8AADD01d0 - - 246 26 13.8M 38.3K
c4t600C0FF00000000009258F5FD5023B01d0 - - 242 25 13.6M 38.6K
c4t600C0FF00000000009258F17E7007801d0 - - 238 27 13.5M 39.4K
c4t600C0FF00000000009258F598F6BE701d0 - - 258 27 14.6M 39.7K
-------------------------------------- ----- ----- ----- ----- ----- -----
^C
bash-3.00#
There are a few related questions that I think you want answered.

1. How does RAID-Z affect performance?

When using RAID-Z, each filesystem block is spread across (typically)
all disks in the raid-z group. So to a first approximation, each raid-z
group provides the iops of a single disk (but the bandwidth of N-1
disks). See Roch's excellent article for a detailed explanation:

http://blogs.sun.com/roller/page/roch?entry=when_to_and_not_to

2. Why does 'zpool iostat' not report actual i/os?
3. Why are we doing so many read i/os?
4. Why are we reading so much data?

'zpool iostat' reports the i/os that are seen at each level of the vdev
tree, rather than the sum of the i/os that occur below that point in the
vdev tree. This can provide some additional information when diagnosing
performance problems. However, it is a bit counter-intuitive, so I
always use iostat(1m). It may be clunky, but it does report on the
actual i/os issued to the hardware. Also, I really like having the
%busy reading, which 'zpool iostat' does not provide.

We are doing lots of read i/os because each block is spread out across
all the disks in a raid-z group (as mentioned in Roch's article).
However, the "vdev cache" is causing us to issue many *fewer* i/os than
would seem to be required, but reading much *more* data.

For example, say we need to read a block of data. We'll send the read
down to the raid-z vdev. The raid-z vdev knows that the data is
spread out over its disks, so it (essentially) issues one read zio_t to
each of the disk vdevs to retrieve the data. Now each of those disk
vdevs will first look in its vdev cache. If it finds the data there, it
returns it without ever actually issuing an i/o to the hardware. If it
doesn't find it there, it will issue a 64k i/o to the hardware, and put
that 64k chunk into its vdev cache.

Without the vdev cache, we would simply issue (Number of blocks to read)
* (Number of disks in each raid-z vdev) read i/os to the hardware, and
read the total number of bytes that you would expect, since each of
those i/os would be for (approximately) 1/Ndisk bytes. However, with
the vdev cache, we will issue fewer i/os, but read more data.

5. How can performance be improved?

A. Use one big pool.

Having 6 pools causes performance (and storage) to be stranded. When
one filesystem is busier than the others, it can only use the bandwidth
and iops of its single raid-z vdev. If you had one big pool, that
filesystem would be able to use all the disks in your system.

B. Use smaller raid-z stripes.

As Roch's article explains, smaller raid-z stripes will provide more
iops. We generally suggest 3 to 9 disks in each raid-z stripe.

C. Use higher-performance disks.

I'm not sure what the underlying storage you're using is, but it's
pretty slow! As you can see from your per-disk iostat output, each
device is only capable of 50-100 iops or 1-4MB/s, and takes on average
over 100ms to service a request. If you are using some sort of hardware
RAID enclosure, it may be working against you here. The preferred
configuration would be to have each disk appear as a single device to
the system. (This should be possible even with fancy RAID hardware.)

So in conclusion, you can improve performance by creating one big pool
with several raid-z stripes, each with 3 to 9 disks in it. These disks
should be actual physical disks.

Hope this helps,
--matt

ps.
I'm drawing my conclusions based on the following data that you provided:

On Wed, May 31, 2006 at 08:26:10AM -0700, Robert Milkowski wrote:
> That's interesting - 'zpool iostat' shows quite small read volume to
> any pool however if I run 'zpool iostat -v' then I can see that while
> read volume to a pool is small, read volume to each disk is actually
> quite large so in summary I get over 10x read volume if I sum all
> disks in a pool than on pool itself. These data are consistent with
> iostat. So now even zpool claims that it actually issues 10x (and
> more) read volume to all disks in a pool than to pool itself.
>
> Now - why???? It really hits performance here...

> The problem is that I do not see any heavy traffic on network
> interfaces nor using zpool iostat. However using just old iostat I can
> see MUCH more traffic going to local disks. The difference is something
> like 10x - zpool iostat shows for example ~6MB/s of reads however
> iostat shows ~50MB/s. The question is who's lying?

> r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
> 57.6 2.0 3752.7 35.4 0.0 8.9 0.0 149.6 0 82 c4t...90E750d0
> 43.5 2.0 2846.9 35.4 0.0 5.3 0.0 116.4 0 66 c4t...90E7F0d0
> 56.6 2.0 3752.7 35.4 0.0 8.4 0.0 144.0 0 81 c4t...90E730d0
> 29.3 44.5 1558.9 212.8 0.0 5.0 0.0 67.1 0 96 c4t...92B540d0
> 9.1 65.7 582.3 313.9 0.0 11.2 0.0 149.4 0 100 c4t...8EDB20d0
> 0.0 78.9 0.0 75.3 0.0 35.0 0.0 443.8 0 100 c4t...9495A0d0

> bash-3.00# fsstat zfs 1
> [...]
> new name name attr attr lookup rddir read read write write
> file remov chng get set ops ops ops bytes ops bytes
> 10 12 8 919 7 102 0 32 975K 26 652K zfs
> 6 21 10 1.22K 1 123 0 205 6.23M 4 33.5K zfs
> 14 26 3 1.14K 9 127 0 46 1.33M 5 60.1K zfs
> 13 11 8 1.02K 7 102 0 43 1.24M 22 514K zfs
> 10 17 10 998 6 87 0 31 746K 85 2.45M zfs
> 11 15 3 915 24 93 0 60 1.86M 6 54.3K zfs
> 7 31 19 1.82K 5 167 0 23 636K 278 8.22M zfs
> 14 22 13 1.44K 10 104 0 31 992K 257 7.84M zfs
> 5 18 5 1.16K 4 80 0 26 764K 262 8.06M zfs
> 1 19 6 572 2 75 0 19 579K 3 20.6K zfs

> bash-3.00# zpool iostat -v p1 1
> capacity operations bandwidth
> pool used avail read write read write
> ------------------------- ----- ----- ----- ----- ----- -----
> p1 749G 67.2G 58 90 878K 903K
> raidz 749G 67.2G 58 90 878K 903K
> c4t500000E011909320d0 - - 15 40 959K 87.3K
> c4t500000E011909300d0 - - 14 40 929K 86.5K
> c4t500000E011903030d0 - - 18 40 1.11M 86.8K
> c4t500000E011903300d0 - - 13 32 823K 77.7K
> c4t500000E0119091E0d0 - - 15 40 961K 87.3K
> c4t500000E0119032D0d0 - - 14 40 930K 86.5K
> c4t500000E011903370d0 - - 18 40 1.11M 86.8K
> c4t500000E011903190d0 - - 13 32 828K 77.8K
> c4t500000E011903350d0 - - 15 40 964K 87.3K
> c4t500000E0119095A0d0 - - 14 40 934K 86.5K
> c4t500000E0119032A0d0 - - 18 40 1.11M 86.8K
> c4t500000E011903340d0 - - 13 32 821K 77.7K
> ------------------------- ----- ----- ----- ----- ----- -----
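To put rough numbers on the vdev-cache effect described above, here is a small
back-of-the-envelope model (an illustrative sketch only, not ZFS code; the
12-disk group matches the pools in this thread and the 64k read size is the one
mentioned above, but the block sizes and the 50% cache-hit ratio are assumptions):

# Rough model of raid-z read amplification as described above (sketch only).
def raidz_read_cost(nblocks, blocksize, ndisks, hit_ratio, vdev_read=64 * 1024):
    logical_bytes = nblocks * blocksize
    # One read zio goes to every disk in the group for each block read;
    # the vdev cache satisfies a fraction of them without touching the disk.
    disk_ios = nblocks * ndisks * (1 - hit_ratio)
    # Without the vdev cache each i/o would be ~blocksize/(ndisks-1) bytes;
    # with it, every miss reads a full 64k chunk.
    bytes_without_cache = nblocks * ndisks * blocksize / (ndisks - 1)
    bytes_with_cache = disk_ios * vdev_read
    return logical_bytes, disk_ios, bytes_without_cache, bytes_with_cache

for bs in (8 * 1024, 128 * 1024):
    logical, ios, raw, cached = raidz_read_cost(
        nblocks=100, blocksize=bs, ndisks=12, hit_ratio=0.5)
    print(f"{bs // 1024:3d}k blocks: {logical // 1024:5d}k at the pool -> "
          f"{ios:.0f} disk i/os, {cached / 1024:.0f}k at the disks "
          f"({raw / 1024:.0f}k without the vdev cache)")

With these assumptions the disk-level read volume comes out anywhere from a few
times to tens of times the pool-level volume, which is the kind of gap the
iostat vs 'zpool iostat' comparison above shows. Note also that the per-disk
iostat output averages out to roughly 64k per read (e.g. 3752.7 kr/s / 57.6 r/s),
consistent with the 64k vdev-cache reads.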
Hello Matthew,
Wednesday, May 31, 2006, 8:09:08 PM, you wrote:
MA> [... Matt's reply quoted in full above - snipped ...]
That helps a lot - thank you.
I wish I had known this before... The information Roch put on his blog should be
explained both in the man pages and the ZFS Admin Guide - as this is something
one would not expect.
It actually means raid-z is useless in many environments compared to
traditional raid-5.
Now I use 3510 JBODs connected over two loops with MPxIO.
The disks are 73GB 15K RPM, so they should be quite fast.
Now I have to find out how to move away from raid-z...
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
> That helps a lot - thank you.
> I wish I had known this before... The information Roch put on his blog should be
> explained both in the man pages and the ZFS Admin Guide - as this is something
> one would not expect.
>
> It actually means raid-z is useless in many environments compared to
> traditional raid-5.

Well, it's a trade-off. With RAID-5 you pay the RAID tax on writes;
with RAID-Z you pay the tax on reads.

There's also another factor at play here, which is purely a matter
of implementation that we need to fix. With a RAID-Z setup, all
blocks are written in RAID-Z format -- even intent log blocks,
which is really stupid. If you do a lot of synchronous writes,
that really hurts your write bandwidth. But it's unnecessary.

Since we know that intent log blocks don't live for more than a
single transaction group (which is about five seconds), there's
no reason to allocate them space-efficiently. It would be far
better, when allocating a B-byte intent log block in an N-disk
RAID-Z group, to allocate B*N bytes but only write to one disk
(or two if you want to be paranoid). This simple change should
make synchronous I/O on N-way RAID-Z up to N times faster.

Jeff
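To make the arithmetic behind "up to N times faster" concrete, here is a rough
sketch of the accounting Jeff describes (illustrative only; the 8k log-block
size and 12-disk group are assumptions, and the space formula ignores raid-z
allocation rounding):

# Sketch of the intent-log idea above: allocate B*N bytes for a B-byte
# log block but write it to only one or two disks, instead of striping
# it space-efficiently across the whole group.
def log_block_cost(block_bytes, ndisks, copies=2):
    current_disk_writes = ndisks                          # today: every disk sees a write
    current_space = block_bytes * ndisks / (ndisks - 1)   # data + single parity, roughly
    proposed_disk_writes = copies                         # proposed: one or two disks
    proposed_space = block_bytes * ndisks                 # space given up for ~5 seconds
    return current_disk_writes, current_space, proposed_disk_writes, proposed_space

cur_w, cur_s, prop_w, prop_s = log_block_cost(block_bytes=8 * 1024, ndisks=12)
print(f"disk writes per log block: {cur_w} today vs {prop_w} proposed")
print(f"space held per log block:  {cur_s / 1024:.1f}k today vs {prop_s / 1024:.0f}k proposed")

Fewer disks touched per synchronous write is where the up-to-N-times figure
comes from; the cost is transient space, which only matters while the log
block lives (about one transaction group).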
Jeff Bonwick wrote:
> ...
>
> Since we know that intent log blocks don't live for more than a
> single transaction group (which is about five seconds), there's
> no reason to allocate them space-efficiently. It would be far
> better, when allocating a B-byte intent log block in an N-disk
> RAID-Z group, to allocate B*N bytes but only write to one disk
> (or two if you want to be paranoid). This simple change should
> make synchronous I/O on N-way RAID-Z up to N times faster.

Would it make sense to keep the space-efficient allocation code around
for times when disk space gets tight (as in less than 100 free blocks
or similar)?

Darren
Hello Jeff,

Thursday, June 1, 2006, 10:36:18 AM, you wrote:

>> That helps a lot - thank you.
>> I wish I had known this before... The information Roch put on his blog should be
>> explained both in the man pages and the ZFS Admin Guide - as this is something
>> one would not expect.
>>
>> It actually means raid-z is useless in many environments compared to
>> traditional raid-5.

JB> Well, it's a trade-off. With RAID-5 you pay the RAID tax on writes;
JB> with RAID-Z you pay the tax on reads.

I know - I only wish I had known better - raid-z should be explained
better in the documentation.

btw: what differences will there be between raidz1 and raidz2? I guess
two checksums will be stored, so one loses approximately the space of
two disks in a raidz2 group. Any other things?

JB> There's also another factor at play here, which is purely a matter
JB> of implementation that we need to fix. With a RAID-Z setup, all
JB> blocks are written in RAID-Z format -- even intent log blocks,
JB> which is really stupid. If you do a lot of synchronous writes,
JB> that really hurts your write bandwidth. But it's unnecessary.
JB> Since we know that intent log blocks don't live for more than a
JB> single transaction group (which is about five seconds), there's
JB> no reason to allocate them space-efficiently. It would be far
JB> better, when allocating a B-byte intent log block in an N-disk
JB> RAID-Z group, to allocate B*N bytes but only write to one disk
JB> (or two if you want to be paranoid). This simple change should
JB> make synchronous I/O on N-way RAID-Z up to N times faster.

That would probably be very useful on nfs servers.

btw: just a quick thought - why not write one block to only 2 disks
(+ checksum on one disk) instead of spreading one fs block across N-1
disks? That way zfs could read many fs blocks at the same time in the
case of larger raid-z pools?

--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Roch Bourbonnais - Performance Engineering
2006-Jun-01 13:00 UTC
[zfs-discuss] Re: Big IOs overhead due to ZFS?
Robert Milkowski writes:

> btw: just a quick thought - why not write one block to only 2 disks
> (+ checksum on one disk) instead of spreading one fs block across N-1
> disks? That way zfs could read many fs blocks at the same time in the
> case of larger raid-z pools?

That's what you have today with a dynamic stripe of (2+1)
raid-z vdevs. If the user requests more devices in a group,
it's because he wants to have those disk blocks for the
storage.

-r

> --
> Best regards,
> Robert mailto:rmilkowski at task.gda.pl
> http://milek.blogspot.com
On Thu, 2006-06-01 at 04:36, Jeff Bonwick wrote:
> It would be far
> better, when allocating a B-byte intent log block in an N-disk
> RAID-Z group, to allocate B*N bytes but only write to one disk
> (or two if you want to be paranoid). This simple change should
> make synchronous I/O on N-way RAID-Z up to N times faster.

I dunno about *wanting* to be paranoid... I'd think that, to provide
the same level of survivability in the face of disk failure as regular
data, you'd need to write to a minimum of 2 disks in a regular RAID-Z
and a minimum of three in a dual-parity raid-Z ...

- Bill
Hello Roch,
Thursday, June 1, 2006, 3:00:46 PM, you wrote:
RBPE> Robert Milkowski writes:
>>
>>
>>
>> btw: just a quick thought - why not to write one block only on 2 disks
>> (+checksum on a one disk) instead of spreading one fs block to N-1
>> disks? That way zfs could read many fs block at the same time in case
>> of larger raid-z pools. ?
RBPE> That's what you have today with a dynamic stripe of (2+1)
RBPE> raid-z vdevs. If the user requests more devices in a group,
RBPE> it's because he wants to have those disk blocks for the
RBPE> storage.
Yeah, that's right - silly me :)
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
On Thu, Jun 01, 2006 at 02:46:32PM +0200, Robert Milkowski wrote:
> btw: what differences will there be between raidz1 and raidz2? I guess
> two checksums will be stored, so one loses approximately the space of
> two disks in a raidz2 group. Any other things?

The difference between raidz1 and raidz2 is just that the latter is
resilient against losing 2 disks rather than just 1. If you have a total
of 5 disks in a raidz1 stripe your optimal capacity will be 4/5ths of the
raw capacity of the disks whereas it would be 3/5ths with raidz2. Consider
however that you'll typically use larger stripes with raidz2 so you aren't
necessarily going to "lose" any capacity depending on how you configure your
pool.

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl
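For reference, the capacity arithmetic above reduces to a one-liner (a trivial
sketch; real pools lose a bit more to metadata and allocation padding):

# Usable fraction of raw capacity for a raid-z stripe, ignoring overheads.
def usable_fraction(ndisks, parity):
    return (ndisks - parity) / ndisks

print(usable_fraction(5, 1))    # raidz1, 5 disks  -> 0.8 (4/5ths, as above)
print(usable_fraction(5, 2))    # raidz2, 5 disks  -> 0.6 (3/5ths)
print(usable_fraction(10, 2))   # wider raidz2     -> 0.8 again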
Hello Adam,

Friday, June 2, 2006, 12:10:47 AM, you wrote:

AL> On Thu, Jun 01, 2006 at 02:46:32PM +0200, Robert Milkowski wrote:
>> btw: what differences will there be between raidz1 and raidz2? I guess
>> two checksums will be stored, so one loses approximately the space of
>> two disks in a raidz2 group. Any other things?

AL> The difference between raidz1 and raidz2 is just that the latter is
AL> resilient against losing 2 disks rather than just 1. If you have a total
AL> of 5 disks in a raidz1 stripe your optimal capacity will be 4/5ths of the
AL> raw capacity of the disks whereas it would be 3/5ths with raidz2. Consider
AL> however that you'll typically use larger stripes with raidz2 so you aren't
AL> necessarily going to "lose" any capacity depending on how you configure your
AL> pool.

If I have 6 disks - wouldn't a pool with 2x raidz1 (3 disks each) actually
be faster than a pool with one raidz2 (6 disks) for many small random reads?
I know that redundancy with raidz2 would be better, as any two disks can
fail, while with 2x raidz1 only one disk from each raidz1 group can fail.

--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
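Applying the first-order approximation from earlier in the thread (one raid-z
vdev delivers roughly the small-random-read iops of a single disk), a
back-of-the-envelope comparison of the two layouts would look like this (the
per-disk iops figure is purely an assumption for a 15k drive, not a measurement):

DISK_IOPS = 150  # assumed small-random-read iops for one 15k rpm disk

def pool_random_read_iops(raidz_vdevs, disk_iops=DISK_IOPS):
    # First-order model: each raid-z vdev ~ one disk's worth of random-read iops.
    return raidz_vdevs * disk_iops

print("2 x raidz1 (3 disks each):", pool_random_read_iops(2), "iops")   # ~300
print("1 x raidz2 (6 disks):     ", pool_random_read_iops(1), "iops")   # ~150

In this model the 2x raidz1 layout reads roughly twice as fast for small random
i/o, while raidz2 buys the stronger guarantee that any two disks can fail.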