So does anyone have any insight on BugID 6535160?

We have verified on a similar system that ZFS shows very high latency in the filebench varmail test. We formatted the same LUN with UFS and latency went down from 300 ms to 1-2 ms.

http://sunsolve.sun.com/search/document.do?assetkey=1-1-6535160-1

We run Solaris 10u4 on our production systems and don't see any indication of a patch for this. I'll try downloading a recent Nevada build, load it on the same system, and see if the problem has indeed vanished post snv_71.
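For context, the comparison above boils down to running filebench's "varmail" workload against a filesystem created on the LUN. A rough sketch of that kind of run, assuming an interactive filebench session; the pool name, mount point, device name, and run length are placeholders, not the exact parameters used here:

    # ZFS case (c2t0d0 is a placeholder LUN name)
    zpool create testpool c2t0d0
    zfs set mountpoint=/testfs testpool

    # interactive filebench session
    filebench
    filebench> load varmail
    filebench> set $dir=/testfs
    filebench> run 60

The UFS case would newfs and mount the same LUN, then point $dir at that mount point instead.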
Vincent Fox wrote:
> So does anyone have any insight on BugID 6535160?
>
> We have verified on a similar system that ZFS shows big latency in the filebench varmail test.
>
> We formatted the same LUN with UFS and latency went down from 300 ms to 1-2 ms.

This is such a big difference it makes me think something else is going on. I suspect one of two possible causes:

A) The disk write cache is enabled and volatile. UFS knows nothing of write caches and requires the write cache to be disabled, otherwise corruption can occur.

B) The write cache is non-volatile, but ZFS hasn't been configured to stop flushing it (set zfs:zfs_nocacheflush = 1).

Note, ZFS enables the write cache and will flush it as necessary.

> http://sunsolve.sun.com/search/document.do?assetkey=1-1-6535160-1
>
> We run Solaris 10u4 on our production systems, don't see any indication of a patch for this.
>
> I'll try downloading a recent Nevada build and load it on the same system and see if the problem has indeed vanished post snv_71.

Yes, please try this. I think it will make a difference but the delta will be small.

Neil.
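For reference, the tunable Neil mentions is normally placed in /etc/system and takes effect at the next reboot. A minimal sketch, assuming (as in case B) that every device backing the pool has a non-volatile, battery-backed write cache:

    * /etc/system
    * Tell ZFS not to send cache-flush commands to the devices.
    * Only safe when the write cache is non-volatile (battery-backed);
    * with a volatile cache this risks data loss on power failure.
    set zfs:zfs_nocacheflush = 1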
> B) The write cache is non-volatile, but ZFS hasn't been configured
> to stop flushing it (set zfs:zfs_nocacheflush = 1).

These are a pair of 2540s with dual controllers, definitely non-volatile cache. We set zfs_nocacheflush=1 and that improved things considerably.

ZFS filesystem (2540 arrays):

fsyncfile3                 434ops/s   0.0mb/s     17.3ms/op      977us/op-cpu
fsyncfile2                 434ops/s   0.0mb/s     17.8ms/op      981us/op-cpu

However, still not very good compared to UFS. We turned off the ZIL with zil_disable=1 and WOW!

ZFS, ZIL disabled:

fsyncfile3                1148ops/s   0.0mb/s      0.0ms/op       18us/op-cpu
fsyncfile2                1148ops/s   0.0mb/s      0.0ms/op       18us/op-cpu

Not a good setting to use in production, but useful data. Anyhow, it will take some time to get OpenSolaris onto the system; will report back then.
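For completeness, the ZIL tunable works the same way as the cache-flush one noted earlier: a sketch of the /etc/system form, picked up at the next boot (whether it was set this way or patched live with mdb isn't stated above):

    * /etc/system
    * Disable the ZIL -- synchronous write guarantees are lost,
    * so this is for benchmarking/testing only, as noted above.
    set zfs:zil_disable = 1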
We loaded Nevada_78 on a peer T2000 unit and imported the same ZFS pool. I didn't even upgrade the pool, since we wanted to be able to move it back to 10u4. Cut 'n' paste of my colleague's email with the results:

Here's the latest Pepsi Challenge results: Sol10u4 vs Nevada78. Same tuning options, same zpool, same storage, same SAN switch - you get the idea. The only difference is the OS.

Sol10u4:

4984: 82.878: Per-Operation Breakdown
closefile4                 404ops/s   0.0mb/s      0.0ms/op       19us/op-cpu
readfile4                  404ops/s   6.3mb/s      0.1ms/op      109us/op-cpu
openfile4                  404ops/s   0.0mb/s      0.1ms/op      112us/op-cpu
closefile3                 404ops/s   0.0mb/s      0.0ms/op       25us/op-cpu
fsyncfile3                 404ops/s   0.0mb/s     18.7ms/op     1168us/op-cpu
appendfilerand3            404ops/s   6.3mb/s      0.2ms/op      192us/op-cpu
readfile3                  404ops/s   6.3mb/s      0.1ms/op      111us/op-cpu
openfile3                  404ops/s   0.0mb/s      0.1ms/op      111us/op-cpu
closefile2                 404ops/s   0.0mb/s      0.0ms/op       24us/op-cpu
fsyncfile2                 404ops/s   0.0mb/s     19.0ms/op     1162us/op-cpu
appendfilerand2            404ops/s   6.3mb/s      0.2ms/op      173us/op-cpu
createfile2                404ops/s   0.0mb/s      0.3ms/op      334us/op-cpu
deletefile1                404ops/s   0.0mb/s      0.2ms/op      173us/op-cpu
4984: 82.879: IO Summary: 318239 ops 5251.8 ops/s, (808/808 r/w) 25.2mb/s, 1228us cpu/op, 9.7ms latency

Nevada78:

1107: 82.554: Per-Operation Breakdown
closefile4                1223ops/s   0.0mb/s      0.0ms/op       22us/op-cpu
readfile4                 1223ops/s  19.4mb/s      0.1ms/op      112us/op-cpu
openfile4                 1223ops/s   0.0mb/s      0.1ms/op      128us/op-cpu
closefile3                1223ops/s   0.0mb/s      0.0ms/op       29us/op-cpu
fsyncfile3                1223ops/s   0.0mb/s      4.6ms/op      256us/op-cpu
appendfilerand3           1223ops/s  19.1mb/s      0.2ms/op      191us/op-cpu
readfile3                 1223ops/s  19.9mb/s      0.1ms/op      116us/op-cpu
openfile3                 1223ops/s   0.0mb/s      0.1ms/op      127us/op-cpu
closefile2                1223ops/s   0.0mb/s      0.0ms/op       28us/op-cpu
fsyncfile2                1223ops/s   0.0mb/s      4.4ms/op      239us/op-cpu
appendfilerand2           1223ops/s  19.1mb/s      0.1ms/op      159us/op-cpu
createfile2               1223ops/s   0.0mb/s      0.5ms/op      389us/op-cpu
deletefile1               1223ops/s   0.0mb/s      0.2ms/op      198us/op-cpu
1107: 82.581: IO Summary: 954637 ops 15903.4 ops/s, (2447/2447 r/w) 77.5mb/s, 590us cpu/op, 2.6ms latency

That's a 3-4x improvement in ops/sec and average fsync time. Here are the results from our UFS software mirror for comparison:

4984: 211.056: Per-Operation Breakdown
closefile4                 465ops/s   0.0mb/s      0.0ms/op       23us/op-cpu
readfile4                  465ops/s  12.6mb/s      0.1ms/op      142us/op-cpu
openfile4                  465ops/s   0.0mb/s      0.1ms/op       83us/op-cpu
closefile3                 465ops/s   0.0mb/s      0.0ms/op       24us/op-cpu
fsyncfile3                 465ops/s   0.0mb/s      6.0ms/op      498us/op-cpu
appendfilerand3            465ops/s   7.3mb/s      1.7ms/op      282us/op-cpu
readfile3                  465ops/s  11.1mb/s      0.1ms/op      132us/op-cpu
openfile3                  465ops/s   0.0mb/s      0.1ms/op       84us/op-cpu
closefile2                 465ops/s   0.0mb/s      0.0ms/op       26us/op-cpu
fsyncfile2                 465ops/s   0.0mb/s      5.9ms/op      445us/op-cpu
appendfilerand2            465ops/s   7.3mb/s      1.1ms/op      231us/op-cpu
createfile2                465ops/s   0.0mb/s      2.2ms/op      443us/op-cpu
deletefile1                465ops/s   0.0mb/s      2.0ms/op      269us/op-cpu
4984: 211.057: IO Summary: 366557 ops 6049.2 ops/s, (931/931 r/w) 38.2mb/s, 912us cpu/op, 4.8ms latency

So either we're hitting a pretty serious zfs bug, or they're purposely holding back performance in Solaris 10 so that we all have a good reason to upgrade to 11. ;)

-Nick
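The "3-4x" figure can be read straight off the fsync and IO Summary lines above:

    ops/s:          15903.4 / 5251.8  ~= 3.0x  higher on Nevada78
    fsync ms/op:       18.7 / 4.6     ~= 4.1x  lower on Nevada78
    avg latency:        9.7 / 2.6     ~= 3.7x  lower on Nevada78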
> So either we're hitting a pretty serious zfs bug, or they're purposely
> holding back performance in Solaris 10 so that we all have a good
> reason to upgrade to 11. ;)

In general, for ZFS we try to push all changes from Nevada back to s10 updates. In particular, "6535160 Lock contention on zl_lock from zil_commit" is pegged for s10u6. And I believe we're going for an early build of update 6, so point patches should hopefully be available even earlier.

Nice to see filebench validating our performance work,

eric