Jim Klimov
2011-Jun-01  20:29 UTC
[zfs-discuss] How to properly read "zpool iostat -v" ? ;)
Hello experts,
I've had a lingering question for some time: when I
use "zpool iostat -v" the values do not quite sum up.
In the example below with a raidz2 array made of 6
drives:
* the reported 33K of writes is less than two disks'
   workload at this time (at 17.9K each); overall
   disk writes are 107.4K = 325% of 33K;
* write ops sum up to 18 = 225% of 8 ops to the pool;
* read MB on drives is 66.2M = 190% of 34.8M read
   from pool;
* read ops on drives sum up to 2014, but pool IOs
   are reported as 804 (disk work is 250% of that).
No two of these ratios are the same, and none matches
the 150% (a 50% increase) I had expected from 2 parity
drives with 4 data drives' worth of data.
Per-disk values are consistent with "iostat -Xn"
output. So I think that accounting of pool/vdev IO
is lying somewhere...
I'm curious: What's up? ;)
                capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool        9.53T  1.34T    804      8  34.8M  33.0K
   raidz2    9.53T  1.34T    804      8  34.8M  33.0K
     c7t0d0      -      -    296      3  11.9M  17.9K
     c7t1d0      -      -    338      3  10.7M  17.9K
     c7t2d0      -      -    367      3  10.4M  17.9K
     c7t3d0      -      -    333      3  12.0M  17.9K
     c7t4d0      -      -    340      3  10.7M  17.9K
     c7t5d0      -      -    340      3  10.5M  17.9K
cache           -      -      -      -      -      -
   c4t1d0p2   230M  15.8G      0      0      0      0
   c4t1d0p3   231M  15.5G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
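The sums quoted in the list above can be rechecked with a short script. This is only a sketch: the figures are copied by hand from the sample just shown, the names `to_number`, `pool` and `disks` are made up for illustration, and the K and M suffixes are assumed to be powers of 1024 (the ratios do not depend on that choice, since each column uses a single unit).

```python
# Figures copied by hand from the first "zpool iostat -v" sample.
# Assumption: K and M suffixes are powers of 1024.
UNITS = {"K": 1024, "M": 1024 ** 2}


def to_number(text):
    """Turn a figure such as '34.8M', '17.9K' or '804' into a plain number."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


COLUMNS = ("read ops", "write ops", "read bw", "write bw")
pool = ("804", "8", "34.8M", "33.0K")
disks = [
    ("296", "3", "11.9M", "17.9K"),  # c7t0d0
    ("338", "3", "10.7M", "17.9K"),  # c7t1d0
    ("367", "3", "10.4M", "17.9K"),  # c7t2d0
    ("333", "3", "12.0M", "17.9K"),  # c7t3d0
    ("340", "3", "10.7M", "17.9K"),  # c7t4d0
    ("340", "3", "10.5M", "17.9K"),  # c7t5d0
]

# For each column, compare the sum over the six disks with the pool figure.
ratios = {}
for i, name in enumerate(COLUMNS):
    disk_total = sum(to_number(row[i]) for row in disks)
    ratios[name] = disk_total / to_number(pool[i])
    print(f"{name:9s}: disks / pool = {ratios[name]:.0%}")
```

The four ratios should come out close to the 250%, 225%, 190% and 325% quoted above, which is the point of the question: none of them is near the expected 150%.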
Sometimes the values are even weirder, e.g. here pool
reads are even less than one component drive's workload,
not to mention all six of them:
                capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool        9.53T  1.34T    404      0  4.41M      0
   raidz2    9.53T  1.34T    404      0  4.41M      0
     c7t0d0      -      -    115      0  4.25M      0
     c7t1d0      -      -    141      0  4.50M      0
     c7t2d0      -      -    133      0  4.41M      0
     c7t3d0      -      -    133      0  4.22M      0
     c7t4d0      -      -    138      0  4.47M      0
     c7t5d0      -      -    141      0  4.45M      0
cache           -      -      -      -      -      -
   c4t1d0p2   230M  15.8G      0      0      0      0
   c4t1d0p3   231M  15.5G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
                capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool        9.53T  1.34T    371      7  3.74M  29.3K
   raidz2    9.53T  1.34T    371      7  3.74M  29.3K
     c7t0d0      -      -    115      2  3.62M  16.5K
     c7t1d0      -      -    132      2  3.70M  15.3K
     c7t2d0      -      -    137      3  3.66M  15.3K
     c7t3d0      -      -    138      3  3.55M  15.3K
     c7t4d0      -      -    145      3  3.69M  16.5K
     c7t5d0      -      -    138      2  3.69M  16.5K
cache           -      -      -      -      -      -
   c4t1d0p2   230M  15.8G      0      0      0      0
   c4t1d0p3   231M  15.5G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----
Thanks,
//Jim Klimov
Marty Scholes
2011-Jun-02  14:45 UTC
[zfs-discuss] How to properly read "zpool iostat -v" ? ;)
While I am by no means an expert on this, I went through a similar mental exercise previously and came to the conclusion that in order to service a particular read request, zfs may need to read more from the disk. For example, a 16KB request in a stripe might need to retrieve the full 128KB stripe, if only to verify the checksum of the stripe prior to returning 16KB to the OS.
If I have understood it correctly, then the vdev numbers refer to the amount of data returned to the OS to satisfy requests, while the individual disk numbers refer to the amount of disk I/O required to satisfy the requests. Does that make sense?
Standard disclaimers apply: I could be wrong, I often am wrong, etc.
-- This message posted from opensolaris.org
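To put a number on the 16KB-out-of-128KB example, here is a minimal sketch of the arithmetic. It only illustrates the guess made in this reply and is not a description of how zfs actually accounts for I/O. The function `read_amplification` and its parameters are hypothetical, the request is assumed to start on a stripe boundary, and parity reads and caching are ignored. The second half compares that with the per-disk and pool read bandwidth from the second sample in the original post.

```python
KB = 1024


def read_amplification(request_bytes, stripe_bytes):
    """Bytes read from disk per byte returned, if whole stripes must be read.

    Assumption: the request starts on a stripe boundary; parity reads and
    caching are ignored.
    """
    stripes_touched = -(-request_bytes // stripe_bytes)  # ceiling division
    return stripes_touched * stripe_bytes / request_bytes


# The 16KB request against a 128KB stripe from the example above.
example = read_amplification(16 * KB, 128 * KB)
print(example)  # 8.0

# Second sample from the original post: per-disk read bandwidth (in M)
# summed over the six drives, divided by the 4.41M reported for the pool.
disk_reads_mb = [4.25, 4.50, 4.41, 4.22, 4.47, 4.45]
observed = sum(disk_reads_mb) / 4.41
print(round(observed, 2))
```

If the guess holds, the ratio would vary with the mix of request sizes, which would at least be consistent with the ratios differing from one sample to the next.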