Daniel Rock
2006-May-26 19:40 UTC
[zfs-discuss] ZFS mirror and read policy; kstat I/O values for zfs
Hi,

after some testing with ZFS I noticed that read requests are not scheduled evenly across the drives; the first drive of each mirror is predominantly selected.

My pool is set up as follows:

    NAME        STATE     READ WRITE CKSUM
    tpc         ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t0d0  ONLINE       0     0     0
        c4t0d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t1d0  ONLINE       0     0     0
        c4t1d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t2d0  ONLINE       0     0     0
        c4t2d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t3d0  ONLINE       0     0     0
        c4t3d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t4d0  ONLINE       0     0     0
        c4t4d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t6d0  ONLINE       0     0     0
        c4t6d0  ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        c1t7d0  ONLINE       0     0     0
        c4t7d0  ONLINE       0     0     0

Disk I/O after doing some benchmarking:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tpc         7.70G  50.9G     85     21  10.5M  1.08M
  mirror    1.10G  7.28G     11      3  1.47M   159K
    c1t0d0      -      -     10      2  1.34M   159K
    c4t0d0      -      -      1      2   138K   159K
  mirror    1.10G  7.27G     11      3  1.48M   159K
    c1t1d0      -      -     10      2  1.34M   159K
    c4t1d0      -      -      1      2   140K   159K
  mirror    1.09G  7.28G     12      3  1.50M   159K
    c1t2d0      -      -     10      2  1.37M   159K
    c4t2d0      -      -      0      2   128K   159K
  mirror    1.10G  7.28G     12      3  1.53M   158K
    c1t3d0      -      -     11      2  1.42M   158K
    c4t3d0      -      -      0      2   110K   158K
  mirror    1.10G  7.28G     11      3  1.44M   158K
    c1t4d0      -      -     10      2  1.33M   158K
    c4t4d0      -      -      0      2   112K   158K
  mirror    1.10G  7.28G     12      3  1.53M   158K
    c1t6d0      -      -     11      2  1.42M   158K
    c4t6d0      -      -      0      2   106K   158K
  mirror    1.11G  7.26G     12      3  1.55M   158K
    c1t7d0      -      -     11      2  1.42M   158K
    c4t7d0      -      -      1      2   130K   158K
----------  -----  -----  -----  -----  -----  -----

or with "iostat":

   11.4    4.3 1451.1  157.1  0.0  0.3    0.4   19.6   0  17 c1t7d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t5d0
   10.7    4.3 1361.4  158.4  0.0  0.3    0.4   22.1   0  18 c1t0d0
   10.9    4.3 1395.7  157.9  0.0  0.3    0.4   18.6   0  16 c1t2d0
    1.0    4.3  129.0  157.1  0.0  0.0    0.8    8.9   0   2 c4t7d0
    0.9    4.3  112.0  156.9  0.0  0.0    0.9    9.4   0   2 c4t4d0
    1.1    4.4  139.5  158.3  0.0  0.0    0.9    8.8   0   3 c4t1d0
   10.6    4.3 1354.8  157.0  0.0  0.3    0.4   18.8   0  16 c1t4d0
    0.9    4.3  109.2  157.3  0.0  0.1    0.9    9.7   0   3 c4t3d0
   10.7    4.4 1363.4  158.3  0.0  0.3    0.4   21.9   0  18 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t8d0
    1.0    4.3  127.0  157.8  0.0  0.0    0.9    9.0   0   2 c4t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c1t8d0
   11.4    4.3 1449.9  156.9  0.0  0.3    0.4   20.0   0  17 c1t6d0
    0.8    4.3  105.4  156.8  0.0  0.0    0.9    8.5   0   2 c4t6d0
   11.3    4.3 1447.4  157.4  0.0  0.3    0.4   18.9   0  17 c1t3d0
    1.1    4.4  137.7  158.4  0.0  0.0    0.9    8.8   0   2 c4t0d0

So you can see the second disk of each mirror pair (c4tXd0) gets almost no read I/O. How does ZFS decide which mirror device to read from?

One more observation: SVM offers kstat values of type KSTAT_TYPE_IO. Why doesn't ZFS (at least at the zpool level)?

And BTW (not ZFS related, but SVM): with the introduction of the SVM bunnahabhain project (friendly names), "iostat -n" output is now completely useless, even if you still use the old naming scheme:

% iostat -n
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    2.3   0   0 c0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    2.4   0   0 c0d1
    0.0    5.0    0.7   21.8  0.0  0.0    0.0    1.5   0   1 c3d0
    0.0    4.1    0.6   20.9  0.0  0.0    0.0    2.8   0   1 c4d0
    1.6   37.3   16.6  164.3  0.1  0.1    2.5    1.6   1   5 c2d0
    1.6   37.5   16.5  164.5  0.1  0.1    3.2    1.7   1   5 c1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 fd0
    2.9    1.9   19.3    4.8  0.0  0.2    0.3   37.2   0   1 md5
    0.0    0.0    0.0    0.0  0.0  0.0    0.0   19.9   0   0 md12
    0.0    0.0    0.0    0.0  0.0  0.0    0.0   12.4   0   0 md13
    0.0    0.0    0.0    0.0  0.0  0.0    3.9   17.7   0   0 md14
    1.5    1.9    9.6    4.8  0.0  0.1    0.0   35.7   0   0 md15
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 md16
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 md17
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 md18
    1.5    1.9    9.6    4.8  0.0  0.1    0.0   27.7   0   0 md19

Instead of "mdXXX" I was expecting the following names:

% ls -lL /dev/md/dsk
Gesamt 0
brw-r-----   1 root     sys       85,  5 Mai 26 00:43 d1
brw-r-----   1 root     sys       85, 15 Mai 26 00:43 root-0
brw-r-----   1 root     sys       85, 19 Mai 26 00:43 root-1
brw-r-----   1 root     sys       85, 18 Mai 26 00:43 scratch
brw-r-----   1 root     sys       85, 16 Mai 26 00:43 scratch-0
brw-r-----   1 root     sys       85, 17 Mai 26 00:43 scratch-1
brw-r-----   1 root     sys       85, 14 Mai 25 17:51 swap
brw-r-----   1 root     sys       85, 12 Mai 26 00:43 swap-0
brw-r-----   1 root     sys       85, 13 Mai 26 00:43 swap-1

Daniel
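
To put a number on the imbalance reported above, here is a minimal Python sketch that sums the per-device read bandwidth from the "zpool iostat -v" figures in the post and reports each controller's share. It is only an illustration and not part of the original message: the function names are hypothetical, the sample rows are copied from the post, and the K/M/G suffixes are assumed to be powers of 1024.

```python
import re

# Leaf-device rows copied from the "zpool iostat -v" output in the post.
# Columns: device, used, avail, read ops, write ops, read bw, write bw.
SAMPLE = """\
c1t0d0 - - 10 2 1.34M 159K
c4t0d0 - - 1 2 138K 159K
c1t1d0 - - 10 2 1.34M 159K
c4t1d0 - - 1 2 140K 159K
c1t2d0 - - 10 2 1.37M 159K
c4t2d0 - - 0 2 128K 159K
c1t3d0 - - 11 2 1.42M 158K
c4t3d0 - - 0 2 110K 158K
c1t4d0 - - 10 2 1.33M 158K
c4t4d0 - - 0 2 112K 158K
c1t6d0 - - 11 2 1.42M 158K
c4t6d0 - - 0 2 106K 158K
c1t7d0 - - 11 2 1.42M 158K
c4t7d0 - - 1 2 130K 158K
"""

# Assumption: suffixes are binary multiples (1K = 1024 bytes).
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(text):
    """Convert a value such as '1.34M' or '138K' to a number of bytes."""
    match = re.fullmatch(r"([0-9.]+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def read_bandwidth_by_controller(text):
    """Sum the read-bandwidth column per controller prefix (c1, c4, ...)."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        # Leaf devices have seven columns and '-' in the capacity columns;
        # pool and mirror summary rows are skipped.
        if len(fields) != 7 or fields[1] != "-":
            continue
        match = re.match(r"c\d+", fields[0])
        if match is None:
            continue
        controller = match.group(0)
        totals[controller] = totals.get(controller, 0.0) + to_bytes(fields[5])
    return totals


def read_share(totals):
    """Return each controller's fraction of the total read bandwidth."""
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}


if __name__ == "__main__":
    for name, share in sorted(read_share(read_bandwidth_by_controller(SAMPLE)).items()):
        print(f"{name}: {share:.1%} of read bandwidth")
```

With the figures from the post, the c1 disks carry roughly 92% of the read bandwidth and the c4 disks about 8%, while the write columns are essentially identical on both sides of every mirror.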
Matthew Ahrens
2006-May-26 23:45 UTC
[zfs-discuss] ZFS mirror and read policy; kstat I/O values for zfs
On Fri, May 26, 2006 at 09:40:57PM +0200, Daniel Rock wrote:
> So you can see the second disk of each mirror pair (c4tXd0) gets almost no
> I/O. How does ZFS decide from which mirror device to read?

You are almost certainly running into this known bug:

6302222 reads from mirror are not spread evenly

--matt
Jeff Bonwick
2006-May-27 01:04 UTC
[zfs-discuss] ZFS mirror and read policy; kstat I/O values for zfs
> You are almost certainly running into this known bug:
>
> 6302222 reads from mirror are not spread evenly

Right. FYI, we fixed this in build 38.

Jeff
Haik Aftandilian
2006-May-30 18:39 UTC
[zfs-discuss] Re: ZFS mirror and read policy; kstat I/O values for zfs
Could we get this bug report updated? Jeff, you marked it as fix delivered but I see no 6302222 putback in the NV gate history. There are also no diffs on the report and I see no indication as to which putback did fix the problem.

Haik

This message posted from opensolaris.org
Neil Perrin
2006-May-30 19:13 UTC
[zfs-discuss] Re: ZFS mirror and read policy; kstat I/O values for zfs
Haik,

6302222 is marked as fixed in snv_38. The Evaluation mentions this was fixed as part of the ditto block work. This bug is:

6410698 ZFS metadata needs to be more highly replicated (ditto blocks)

(which is also marked as fixed in snv_38).

Hope that helps: Neil.

Haik Aftandilian wrote On 05/30/06 12:39,:
> Could we get this bug report updated? Jeff, you marked it as fix delivered but I see no 6302222 putback in the NV gate history. There are also no diffs on the report and I see no indication as to which putback did fix the problem.
>
> Haik
Haik Aftandilian
2006-May-30 20:47 UTC
[zfs-discuss] Re: ZFS mirror and read policy; kstat I/O values for zfs
> 6302222 is marked as fixed in snv_38. The Evaluation mentions this
> was fixed as part of the ditto block work. This bug is:
>
> 6410698 ZFS metadata needs to be more highly replicated (ditto
> blocks) (which is also marked as fixed in snv_38).
>
> Hope that helps: Neil.

Ah, thanks. Technically, this bug should not be marked as fixed. It should be closed as a dupe or just closed as "will not fix" with a comment indicating it was fixed by 6410698. The reason is that the 6410698 putback did not include 6302222 in the putback comments, so there is no record of 6302222 in the source tree. Anyone who comes across the bug and wants to see the source code changes will not be able to find them.

Haik
Bill Sommerfeld
2006-May-30 21:20 UTC
[zfs-discuss] Re: ZFS mirror and read policy; kstat I/O values for zfs
On Tue, 2006-05-30 at 16:47, Haik Aftandilian wrote:
> Technically, this bug should not be marked as fixed. It should be closed
> as a dupe or just closed as "will not fix" with a comment indicating it
> was fixed by 6410698.

In past cases like this, I was told to close it as "unreproduceable" rather than "will not fix" (again with a note indicating the bug which probably fixed it).

- Bill