Robert Milkowski
2006-Aug-07 15:22 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi.
3510 with two HW controllers, configured with one LUN in RAID-10 using 12 disks in
the head unit (FC-AL 73GB 15K disks). Optimization set to random, stripe size 32KB.
Connected to a v440 using two links; however, only one link was used in the tests (no
MPxIO).
I used filebench with the varmail test at default parameters, run for 60s; each test
was run twice.
The system is S10U2 with all available patches (all support patches), kernel -18.
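The pool-creation commands for the ZFS-on-HW-LUN case aren't included in the transcript
below; judging from the pool name and device that do appear there
(se3510_hw_raid10_12disks, c3t40d0), the setup was presumably along these lines
(t1 may be just a directory rather than a child filesystem):
# zpool create se3510_hw_raid10_12disks c3t40d0
# zfs set atime=off se3510_hw_raid10_12disks
# zfs create se3510_hw_raid10_12disks/t1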
ZFS filesystem on HW lun with atime=off:
IO Summary: 499078 ops 8248.0 ops/s, (1269/1269 r/w) 40.6mb/s, 314us
cpu/op, 6.0ms latency
IO Summary: 503112 ops 8320.2 ops/s, (1280/1280 r/w) 41.0mb/s, 296us
cpu/op, 5.9ms latency
Now the same LUN, but with the ZFS pool destroyed and a UFS filesystem created instead.
UFS filesystem on the HW LUN with maxcontig=24 and noatime:
IO Summary: 401671 ops 6638.2 ops/s, (1021/1021 r/w) 32.7mb/s, 404us
cpu/op, 7.5ms latency
IO Summary: 403194 ops 6664.5 ops/s, (1025/1025 r/w) 32.5mb/s, 406us
cpu/op, 7.5ms latency
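maxcontig=24 was set at filesystem-creation time (newfs -C 24, see the details below);
on an existing UFS the same parameter can presumably also be adjusted with tunefs, e.g.:
# tunefs -a 24 /dev/rdsk/c3t40d0s0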
Now another v440 server (the same config) with snv_44, with several 3510
JBODs connected over two FC loops; however, only one loop was used (no MPxIO). The
same disks (73GB FC-AL 15K).
ZFS filesystem with atime=off on a ZFS RAID-10 using 12 disks from one
enclosure:
IO Summary: 558331 ops 9244.1 ops/s, (1422/1422 r/w) 45.2mb/s, 312us
cpu/op, 5.2ms latency
IO Summary: 537542 ops 8899.9 ops/s, (1369/1369 r/w) 43.5mb/s, 307us
cpu/op, 5.4ms latency
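The zpool status output in the details below shows the pool layout; a pool like that
is presumably created along these lines, striping six two-way mirrors:
# zpool create zfs_raid10_12disks \
      mirror c2t16d0 c2t17d0 mirror c2t18d0 c2t19d0 mirror c2t20d0 c2t21d0 \
      mirror c2t22d0 c2t23d0 mirror c2t24d0 c2t25d0 mirror c2t26d0 c2t27d0
# zfs set atime=off zfs_raid10_12disks
# zfs create zfs_raid10_12disks/t1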
### details ####
$ cat zfs-benhc.txt
v440, Generic_118833-18
filebench> set $dir=/se3510_hw_raid10_12disks/t1/
filebench> run 60
582: 42.107: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
582: 42.108: Creating fileset bigfileset...
582: 45.262: Preallocated 812 of 1000 of fileset bigfileset in 4 seconds
582: 45.262: Creating/pre-allocating files
582: 45.262: Starting 1 filereader instances
586: 46.268: Starting 16 filereaderthread threads
582: 49.278: Running...
582: 109.787: Run took 60 seconds...
582: 109.801: Per-Operation Breakdown
closefile4 634ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 634ops/s 10.3mb/s 0.1ms/op 65us/op-cpu
openfile4 634ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile3 634ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 634ops/s 0.0mb/s 11.3ms/op 150us/op-cpu
appendfilerand3 635ops/s 9.9mb/s 0.1ms/op 132us/op-cpu
readfile3 635ops/s 10.4mb/s 0.1ms/op 66us/op-cpu
openfile3 635ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile2 635ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 635ops/s 0.0mb/s 11.9ms/op 137us/op-cpu
appendfilerand2 635ops/s 9.9mb/s 0.1ms/op 94us/op-cpu
createfile2 634ops/s 0.0mb/s 0.2ms/op 163us/op-cpu
deletefile1 634ops/s 0.0mb/s 0.1ms/op 86us/op-cpu
582: 109.801:
IO Summary: 499078 ops 8248.0 ops/s, (1269/1269 r/w) 40.6mb/s, 314us
cpu/op, 6.0ms latency
582: 109.801: Shutting down processes
filebench>
filebench> run 60
582: 190.655: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
582: 190.720: Removed any existing fileset bigfileset in 1 seconds
582: 190.720: Creating fileset bigfileset...
582: 193.259: Preallocated 786 of 1000 of fileset bigfileset in 3 seconds
582: 193.259: Creating/pre-allocating files
582: 193.259: Starting 1 filereader instances
591: 194.268: Starting 16 filereaderthread threads
582: 197.278: Running...
582: 257.748: Run took 60 seconds...
582: 257.761: Per-Operation Breakdown
closefile4 640ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 640ops/s 10.5mb/s 0.1ms/op 64us/op-cpu
openfile4 640ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile3 640ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 640ops/s 0.0mb/s 11.1ms/op 147us/op-cpu
appendfilerand3 640ops/s 10.0mb/s 0.1ms/op 124us/op-cpu
readfile3 640ops/s 10.5mb/s 0.1ms/op 67us/op-cpu
openfile3 640ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile2 640ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 640ops/s 0.0mb/s 11.9ms/op 139us/op-cpu
appendfilerand2 640ops/s 10.0mb/s 0.1ms/op 89us/op-cpu
createfile2 640ops/s 0.0mb/s 0.2ms/op 157us/op-cpu
deletefile1 640ops/s 0.0mb/s 0.1ms/op 87us/op-cpu
582: 257.761:
IO Summary: 503112 ops 8320.2 ops/s, (1280/1280 r/w) 41.0mb/s, 296us
cpu/op, 5.9ms latency
582: 257.761: Shutting down processes
filebench>
bash-3.00# zpool destroy se3510_hw_raid10_12disks
bash-3.00# newfs -C 24 /dev/rdsk/c3t40d0s0
newfs: construct a new file system /dev/rdsk/c3t40d0s0: (y/n)? y
Warning: 4164 sector(s) in last cylinder unallocated
/dev/rdsk/c3t40d0s0: 857083836 sectors in 139500 cylinders of 48 tracks, 128
sectors
418498.0MB in 8719 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
...............................................................................
...............................................................................
................
super-block backups for last 10 cylinder groups at:
856130208, 856228640, 856327072, 856425504, 856523936, 856622368, 856720800,
856819232, 856917664, 857016096
bash-3.00# mount -o noatime /dev/dsk/c3t40d0s0 /mnt/
bash-3.00#
bash-3.00# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
632: 2.758: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
632: 2.759: Usage: set $dir=<dir>
632: 2.759: set $filesize=<size> defaults to 16384
632: 2.759: set $nfiles=<value> defaults to 1000
632: 2.759: set $nthreads=<value> defaults to 16
632: 2.759: set $meaniosize=<value> defaults to 16384
632: 2.759: set $meandirwidth=<size> defaults to 1000000
632: 2.759: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
632: 2.759: dirdepth therefore defaults to dir depth of 1 as in postmark
632: 2.759: set $meandir lower to increase depth beyond 1 if desired)
632: 2.759:
632: 2.759: run runtime (e.g. run 60)
632: 2.759: syntax error, token expected on line 51
filebench> set $dir=/mnt/
filebench> run 60
632: 7.699: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
632: 7.722: Creating fileset bigfileset...
632: 10.611: Preallocated 812 of 1000 of fileset bigfileset in 3 seconds
632: 10.611: Creating/pre-allocating files
632: 10.611: Starting 1 filereader instances
633: 11.615: Starting 16 filereaderthread threads
632: 14.625: Running...
632: 75.135: Run took 60 seconds...
632: 75.149: Per-Operation Breakdown
closefile4 511ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 511ops/s 8.4mb/s 0.1ms/op 65us/op-cpu
openfile4 511ops/s 0.0mb/s 0.0ms/op 37us/op-cpu
closefile3 511ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile3 511ops/s 0.0mb/s 9.7ms/op 168us/op-cpu
appendfilerand3 511ops/s 8.0mb/s 2.6ms/op 190us/op-cpu
readfile3 511ops/s 8.3mb/s 0.1ms/op 65us/op-cpu
openfile3 511ops/s 0.0mb/s 0.0ms/op 37us/op-cpu
closefile2 511ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 511ops/s 0.0mb/s 8.4ms/op 152us/op-cpu
appendfilerand2 511ops/s 8.0mb/s 1.7ms/op 170us/op-cpu
createfile2 511ops/s 0.0mb/s 4.3ms/op 297us/op-cpu
deletefile1 511ops/s 0.0mb/s 3.1ms/op 145us/op-cpu
632: 75.149:
IO Summary: 401671 ops 6638.2 ops/s, (1021/1021 r/w) 32.7mb/s, 404us
cpu/op, 7.5ms latency
632: 75.149: Shutting down processes
filebench> run 60
632: 193.974: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
632: 194.874: Removed any existing fileset bigfileset in 1 seconds
632: 194.875: Creating fileset bigfileset...
632: 196.817: Preallocated 786 of 1000 of fileset bigfileset in 2 seconds
632: 196.817: Creating/pre-allocating files
632: 196.817: Starting 1 filereader instances
636: 197.825: Starting 16 filereaderthread threads
632: 200.835: Running...
632: 261.335: Run took 60 seconds...
632: 261.350: Per-Operation Breakdown
closefile4 513ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 513ops/s 8.2mb/s 0.1ms/op 64us/op-cpu
openfile4 513ops/s 0.0mb/s 0.0ms/op 38us/op-cpu
closefile3 513ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile3 513ops/s 0.0mb/s 9.7ms/op 169us/op-cpu
appendfilerand3 513ops/s 8.0mb/s 2.7ms/op 189us/op-cpu
readfile3 513ops/s 8.3mb/s 0.1ms/op 65us/op-cpu
openfile3 513ops/s 0.0mb/s 0.0ms/op 38us/op-cpu
closefile2 513ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 513ops/s 0.0mb/s 8.4ms/op 154us/op-cpu
appendfilerand2 513ops/s 8.0mb/s 1.7ms/op 165us/op-cpu
createfile2 513ops/s 0.0mb/s 4.2ms/op 301us/op-cpu
deletefile1 513ops/s 0.0mb/s 3.2ms/op 148us/op-cpu
632: 261.350:
IO Summary: 403194 ops 6664.5 ops/s, (1025/1025 r/w) 32.5mb/s, 406us
cpu/op, 7.5ms latency
632: 261.350: Shutting down processes
filebench>
v440, snv_44
bash-3.00# zpool status
pool: zfs_raid10_12disks
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
zfs_raid10_12disks ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t16d0 ONLINE 0 0 0
c2t17d0 ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t18d0 ONLINE 0 0 0
c2t19d0 ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t20d0 ONLINE 0 0 0
c2t21d0 ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t22d0 ONLINE 0 0 0
c2t23d0 ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t24d0 ONLINE 0 0 0
c2t25d0 ONLINE 0 0 0
mirror ONLINE 0 0 0
c2t26d0 ONLINE 0 0 0
c2t27d0 ONLINE 0 0 0
errors: No known data errors
bash-3.00#
bash-3.00# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
393: 6.283: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
393: 6.283: Usage: set $dir=<dir>
393: 6.283: set $filesize=<size> defaults to 16384
393: 6.283: set $nfiles=<value> defaults to 1000
393: 6.283: set $nthreads=<value> defaults to 16
393: 6.283: set $meaniosize=<value> defaults to 16384
393: 6.284: set $meandirwidth=<size> defaults to 1000000
393: 6.284: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
393: 6.284: dirdepth therefore defaults to dir depth of 1 as in postmark
393: 6.284: set $meandir lower to increase depth beyond 1 if desired)
393: 6.284:
393: 6.284: run runtime (e.g. run 60)
393: 6.284: syntax error, token expected on line 51
filebench> set $dir=/zfs_raid10_12disks/t1/
filebench> run 60
393: 18.766: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
393: 18.767: Creating fileset bigfileset...
393: 23.020: Preallocated 812 of 1000 of fileset bigfileset in 5 seconds
393: 23.020: Creating/pre-allocating files
393: 23.020: Starting 1 filereader instances
394: 24.030: Starting 16 filereaderthread threads
393: 27.040: Running...
393: 87.440: Run took 60 seconds...
393: 87.453: Per-Operation Breakdown
closefile4 711ops/s 0.0mb/s 0.0ms/op 9us/op-cpu
readfile4 711ops/s 11.4mb/s 0.1ms/op 62us/op-cpu
openfile4 711ops/s 0.0mb/s 0.1ms/op 65us/op-cpu
closefile3 711ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 711ops/s 0.0mb/s 10.0ms/op 148us/op-cpu
appendfilerand3 711ops/s 11.1mb/s 0.1ms/op 129us/op-cpu
readfile3 711ops/s 11.6mb/s 0.1ms/op 63us/op-cpu
openfile3 711ops/s 0.0mb/s 0.1ms/op 65us/op-cpu
closefile2 711ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 711ops/s 0.0mb/s 10.0ms/op 115us/op-cpu
appendfilerand2 711ops/s 11.1mb/s 0.1ms/op 97us/op-cpu
createfile2 711ops/s 0.0mb/s 0.2ms/op 163us/op-cpu
deletefile1 711ops/s 0.0mb/s 0.1ms/op 89us/op-cpu
393: 87.454:
IO Summary: 558331 ops 9244.1 ops/s, (1422/1422 r/w) 45.2mb/s, 312us
cpu/op, 5.2ms latency
393: 87.454: Shutting down processes
filebench> run 60
393: 118.054: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
393: 118.108: Removed any existing fileset bigfileset in 1 seconds
393: 118.108: Creating fileset bigfileset...
393: 122.619: Preallocated 786 of 1000 of fileset bigfileset in 5 seconds
393: 122.619: Creating/pre-allocating files
393: 122.619: Starting 1 filereader instances
401: 123.630: Starting 16 filereaderthread threads
393: 126.640: Running...
393: 187.040: Run took 60 seconds...
393: 187.053: Per-Operation Breakdown
closefile4 685ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 685ops/s 11.1mb/s 0.1ms/op 62us/op-cpu
openfile4 685ops/s 0.0mb/s 0.1ms/op 65us/op-cpu
closefile3 685ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 685ops/s 0.0mb/s 10.5ms/op 150us/op-cpu
appendfilerand3 685ops/s 10.7mb/s 0.1ms/op 124us/op-cpu
readfile3 685ops/s 11.1mb/s 0.1ms/op 60us/op-cpu
openfile3 685ops/s 0.0mb/s 0.1ms/op 65us/op-cpu
closefile2 685ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 685ops/s 0.0mb/s 10.4ms/op 113us/op-cpu
appendfilerand2 685ops/s 10.7mb/s 0.1ms/op 93us/op-cpu
createfile2 685ops/s 0.0mb/s 0.2ms/op 156us/op-cpu
deletefile1 685ops/s 0.0mb/s 0.1ms/op 89us/op-cpu
393: 187.054:
IO Summary: 537542 ops 8899.9 ops/s, (1369/1369 r/w) 43.5mb/s, 307us
cpu/op, 5.4ms latency
393: 187.054: Shutting down processes
filebench>
This message posted from opensolaris.org
Eric Schrock
2006-Aug-07 15:53 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Cool stuff, Robert. It'd be interesting to see some RAID-Z (single- and
double-parity) benchmarks as well, but understandably this takes time ;-)

The first thing to note is that the current Nevada bits have a number of
performance fixes not in S10u2, so there's going to be a natural bias
when comparing ZFS to ZFS between these systems.

Second, you may be able to get more performance from the ZFS filesystem
on the HW lun by tweaking the max pending # of requests. One thing
we've found is that ZFS currently has a hardcoded limit of how many
outstanding requests to send to the underlying vdev (35). This works
well for most single devices, but large arrays can actually handle more,
and we end up leaving some performance on the floor. Currently the only
way to tweak this variable is through 'mdb -kw'. Try something like:

# mdb -kw
> ::spa -v
ADDR                 STATE     NAME
ffffffff82ef4140     ACTIVE    pool

    ADDR             STATE     AUX          DESCRIPTION
    ffffffff9677a1c0 HEALTHY   -            root
    ffffffff9678d080 HEALTHY   -              raidz
    ffffffff9678db80 HEALTHY   -                /dev/dsk/c2d0s0
    ffffffff96778640 HEALTHY   -                /dev/dsk/c3d0s0
    ffffffff967780c0 HEALTHY   -                /dev/dsk/c4d0s0
    ffffffff9e495780 HEALTHY   -                /dev/dsk/c5d0s0
> ffffffff9678db80::print -a vdev_t vdev_queue.vq_max_pending
ffffffff9678df00 vdev_queue.vq_max_pending = 0x23
> ffffffff9678df00/Z0t60
0xffffffff9678df00:    0x23    =    0x3c

This will change the max # of pending requests for the disk to 60,
instead of 35. We're trying to figure out how to tweak and/or
dynamically detect the best value here, so any more data would be
useful.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
Robert Milkowski
2006-Aug-07 16:16 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Eric,
Monday, August 7, 2006, 5:53:38 PM, you wrote:
ES> Cool stuff, Robert. It'd be interesting to see some RAID-Z (single- and
ES> double-parity) benchmarks as well, but understandably this takes time
ES> ;-)

I intend to test raid-z. Not sure there'll be enough time for raidz2.

ES> The first thing to note is that the current Nevada bits have a number of
ES> performance fixes not in S10u2, so there's going to be a natural bias
ES> when comparing ZFS to ZFS between these systems.

Yeah, I know. That's why I also put UFS on the HW config, to check whether ZFS
underperforms on U2.

ES> Second, you may be able to get more performance from the ZFS filesystem
ES> on the HW lun by tweaking the max pending # of requests. One thing
ES> we've found is that ZFS currently has a hardcoded limit of how many
ES> outstanding requests to send to the underlying vdev (35). This works
ES> well for most single devices, but large arrays can actually handle more,
ES> and we end up leaving some performance on the floor. Currently the only
ES> way to tweak this variable is through 'mdb -kw'. Try something like:

Well, strange - I did try with values of 1, 60 and 256, and basically I
get the same results from the varmail test.
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Luke Lonergan
2006-Aug-07 16:27 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Niiiice! Hooray ZFS!
- Luke
Sent from my GoodLink synchronized handheld (www.good.com)
-----Original Message-----
From: Robert Milkowski [mailto:milek at task.gda.pl]
Sent: Monday, August 07, 2006 11:25 AM Eastern Standard Time
To: zfs-discuss at opensolaris.org
Subject: [zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Eric Schrock
2006-Aug-07 16:30 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
On Mon, Aug 07, 2006 at 06:16:12PM +0200, Robert Milkowski wrote:
> Well, strange - I did try with value of 1, 60 and 256. And basically I
> get the same results from varmail tests.

Well that's good data, too. It means that this isn't an impediment for
this particular test. It was just a shot in the dark...

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
Richard Elling
2006-Aug-07 16:54 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi Robert, thanks for the data.
Please clarify one thing for me.
In the case of the HW raid, was there just one LUN? Or was it 12 LUNs?
-- richard
eric kustarz
2006-Aug-07 17:38 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
> Well, strange - I did try with value of 1, 60 and 256. And basically I
> get the same results from varmail tests.

If vdev_reopen() is called then it will reset vq_max_pending to the
vdev_knob's default value. So you can set the "global" vq_max_pending in
vdev_knob (though this affects all pools and all vdevs of each pool):

#mdb -kw
> vdev_knob::print
....

Also, here's a simple dscript (doesn't work on U2 though due to a CTF
bug, but works on nevada). This tells the average and distribution # of
I/Os you tried doing. So if you find this under 35, then upping
vq_max_pending won't help. If however, you find you're continually
hitting the upper limit of 35, upping vq_max_pending should help.

#!/usr/sbin/dtrace -s

vdev_queue_io_to_issue:return
/arg1 != NULL/
{
        @c["issued I/O"] = count();
}

vdev_queue_io_to_issue:return
/arg1 == NULL/
{
        @c["didn't issue I/O"] = count();
}

vdev_queue_io_to_issue:entry
{
        @avgers["avg pending I/Os"] = avg(args[0]->vq_pending_tree.avl_numnodes);
        @lquant["quant pending I/Os"] = quantize(args[0]->vq_pending_tree.avl_numnodes);
        @c["total times tried to issue I/O"] = count();
}

vdev_queue_io_to_issue:entry
/args[0]->vq_pending_tree.avl_numnodes > 349/
{
        @avgers["avg pending I/Os > 349"] = avg(args[0]->vq_pending_tree.avl_numnodes);
        @quant["quant pending I/Os > 349"] = lquantize(args[0]->vq_pending_tree.avl_numnodes, 33, 1000, 1);
        @c["total times tried to issue I/O where > 349"] = count();
}

/* bail after 5 minutes */
tick-300sec
{
        exit(0);
}
Robert Milkowski
2006-Aug-08 08:38 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Richard,
Monday, August 7, 2006, 6:54:37 PM, you wrote:
RE> Hi Robert, thanks for the data.
RE> Please clarify one thing for me.
RE> In the case of the HW raid, was there just one LUN? Or was it 12 LUNs?
Just one LUN, which was built on the 3510 from 12 disks in RAID-1(0).
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Robert Milkowski
2006-Aug-08 14:13 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi.
This time some RAID5/RAID-Z benchmarks.
This time I connected the 3510 head unit with one link to the same server the 3510
JBODs are connected to (using the second link). snv_44 is used; the server is a v440.
I also tried changing the max pending IO requests for the HW RAID-5 LUN and checked
with DTrace that the larger value is really used - it is, but it doesn't change the
benchmark numbers.
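The exact DTrace check isn't shown here; reusing the probe and fields from eric
kustarz's script earlier in the thread, a one-liner along these lines would report
the peak number of pending I/Os seen, and therefore whether the raised limit is
actually being used:
# dtrace -n 'vdev_queue_io_to_issue:entry { @max_pending = max(args[0]->vq_pending_tree.avl_numnodes); }'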
1. ZFS on HW RAID5 with 6 disks, atime=off
IO Summary: 444386 ops 7341.7 ops/s, (1129/1130 r/w) 36.1mb/s, 297us
cpu/op, 6.6ms latency
IO Summary: 438649 ops 7247.0 ops/s, (1115/1115 r/w) 35.5mb/s, 293us
cpu/op, 6.7ms latency
2. ZFS with software RAID-Z with 6 disks, atime=off
IO Summary: 457505 ops 7567.3 ops/s, (1164/1164 r/w) 37.2mb/s, 340us
cpu/op, 6.4ms latency
IO Summary: 457767 ops 7567.8 ops/s, (1164/1165 r/w) 36.9mb/s, 340us
cpu/op, 6.4ms latency
3. UFS on HW RAID5 with 6 disks, noatime
IO Summary: 62776 ops 1037.3 ops/s, (160/160 r/w) 5.5mb/s, 481us
cpu/op, 49.7ms latency
IO Summary: 63661 ops 1051.6 ops/s, (162/162 r/w) 5.4mb/s, 477us
cpu/op, 49.1ms latency
4. UFS on HW RAID5 with 6 disks, noatime, S10U2 + patches (the same filesystem
mounted as in 3)
IO Summary: 393167 ops 6503.1 ops/s, (1000/1001 r/w) 32.4mb/s, 405us
cpu/op, 7.5ms latency
IO Summary: 394525 ops 6521.2 ops/s, (1003/1003 r/w) 32.0mb/s, 407us
cpu/op, 7.7ms latency
5. ZFS with software RAID-Z with 6 disks, atime=off, S10U2 + patches (the same
disks as in test #2)
IO Summary: 461708 ops 7635.5 ops/s, (1175/1175 r/w) 37.4mb/s, 330us
cpu/op, 6.4ms latency
IO Summary: 457649 ops 7562.1 ops/s, (1163/1164 r/w) 37.0mb/s, 328us
cpu/op, 6.5ms latency
In this benchmark, software RAID-5 with ZFS (RAID-Z, to be precise) gives slightly
better performance than hardware RAID-5. ZFS is also faster in both cases
(HW and SW RAID) than UFS on HW RAID.
Something is wrong with UFS on snv_44 - the same UFS filesystem on S10U2 works
as expected.
ZFS on S10U2 gives the same results in this benchmark as on snv_44.
#### details ####
// c2t43d0 is a HW raid5 made of 6 disks
// array is configured for random I/Os
# zpool create HW_RAID5_6disks c2t43d0
#
# zpool create -f zfs_raid5_6disks raidz c3t16d0 c3t17d0 c3t18d0 c3t19d0 c3t20d0
c3t21d0
#
# zfs set atime=off zfs_raid5_6disks HW_RAID5_6disks
#
# zfs create HW_RAID5_6disks/t1
# zfs create zfs_raid5_6disks/t1
#
# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
450: 3.175: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
450: 3.199: Usage: set $dir=<dir>
450: 3.199: set $filesize=<size> defaults to 16384
450: 3.199: set $nfiles=<value> defaults to 1000
450: 3.199: set $nthreads=<value> defaults to 16
450: 3.199: set $meaniosize=<value> defaults to 16384
450: 3.199: set $meandirwidth=<size> defaults to 1000000
450: 3.199: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
450: 3.199: dirdepth therefore defaults to dir depth of 1 as in postmark
450: 3.199: set $meandir lower to increase depth beyond 1 if desired)
450: 3.199:
450: 3.199: run runtime (e.g. run 60)
450: 3.199: syntax error, token expected on line 51
filebench> set $dir=/HW_RAID5_6disks/t1
filebench> run 60
450: 13.320: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
450: 13.321: Creating fileset bigfileset...
450: 15.514: Preallocated 812 of 1000 of fileset bigfileset in 3 seconds
450: 15.515: Creating/pre-allocating files
450: 15.515: Starting 1 filereader instances
451: 16.525: Starting 16 filereaderthread threads
450: 19.535: Running...
450: 80.065: Run took 60 seconds...
450: 80.079: Per-Operation Breakdown
closefile4 565ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 565ops/s 9.2mb/s 0.1ms/op 60us/op-cpu
openfile4 565ops/s 0.0mb/s 0.1ms/op 64us/op-cpu
closefile3 565ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 565ops/s 0.0mb/s 12.9ms/op 147us/op-cpu
appendfilerand3 565ops/s 8.8mb/s 0.1ms/op 126us/op-cpu
readfile3 565ops/s 9.2mb/s 0.1ms/op 60us/op-cpu
openfile3 565ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile2 565ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 565ops/s 0.0mb/s 12.9ms/op 102us/op-cpu
appendfilerand2 565ops/s 8.8mb/s 0.1ms/op 92us/op-cpu
createfile2 565ops/s 0.0mb/s 0.2ms/op 154us/op-cpu
deletefile1 565ops/s 0.0mb/s 0.1ms/op 86us/op-cpu
450: 80.079:
IO Summary: 444386 ops 7341.7 ops/s, (1129/1130 r/w) 36.1mb/s, 297us
cpu/op, 6.6ms latency
450: 80.079: Shutting down processes
filebench> run 60
450: 115.945: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
450: 115.998: Removed any existing fileset bigfileset in 1 seconds
450: 115.998: Creating fileset bigfileset...
450: 118.049: Preallocated 786 of 1000 of fileset bigfileset in 3 seconds
450: 118.049: Creating/pre-allocating files
450: 118.049: Starting 1 filereader instances
454: 119.055: Starting 16 filereaderthread threads
450: 122.065: Running...
450: 182.595: Run took 60 seconds...
450: 182.608: Per-Operation Breakdown
closefile4 557ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 557ops/s 9.0mb/s 0.1ms/op 59us/op-cpu
openfile4 557ops/s 0.0mb/s 0.1ms/op 64us/op-cpu
closefile3 557ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 557ops/s 0.0mb/s 13.0ms/op 149us/op-cpu
appendfilerand3 558ops/s 8.7mb/s 0.1ms/op 120us/op-cpu
readfile3 558ops/s 9.0mb/s 0.1ms/op 59us/op-cpu
openfile3 558ops/s 0.0mb/s 0.1ms/op 64us/op-cpu
closefile2 558ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 558ops/s 0.0mb/s 13.2ms/op 100us/op-cpu
appendfilerand2 558ops/s 8.7mb/s 0.1ms/op 90us/op-cpu
createfile2 557ops/s 0.0mb/s 0.1ms/op 151us/op-cpu
deletefile1 557ops/s 0.0mb/s 0.1ms/op 86us/op-cpu
450: 182.609:
IO Summary: 438649 ops 7247.0 ops/s, (1115/1115 r/w) 35.5mb/s, 293us
cpu/op, 6.7ms latency
450: 182.609: Shutting down processes
filebench> quit
# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
458: 2.590: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
458: 2.591: Usage: set $dir=<dir>
458: 2.591: set $filesize=<size> defaults to 16384
458: 2.591: set $nfiles=<value> defaults to 1000
458: 2.591: set $nthreads=<value> defaults to 16
458: 2.591: set $meaniosize=<value> defaults to 16384
458: 2.591: set $meandirwidth=<size> defaults to 1000000
458: 2.591: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
458: 2.591: dirdepth therefore defaults to dir depth of 1 as in postmark
458: 2.592: set $meandir lower to increase depth beyond 1 if desired)
458: 2.592:
458: 2.592: run runtime (e.g. run 60)
458: 2.592: syntax error, token expected on line 51
filebench> set $dir=/zfs_raid5_6disks/t1
filebench> run 60
458: 9.251: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
458: 9.251: Creating fileset bigfileset...
458: 14.232: Preallocated 812 of 1000 of fileset bigfileset in 5 seconds
458: 14.232: Creating/pre-allocating files
458: 14.232: Starting 1 filereader instances
459: 15.235: Starting 16 filereaderthread threads
458: 18.245: Running...
458: 78.704: Run took 60 seconds...
458: 78.718: Per-Operation Breakdown
closefile4 582ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 582ops/s 9.6mb/s 0.1ms/op 62us/op-cpu
openfile4 582ops/s 0.0mb/s 0.1ms/op 67us/op-cpu
closefile3 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 582ops/s 0.0mb/s 12.4ms/op 206us/op-cpu
appendfilerand3 582ops/s 9.1mb/s 0.1ms/op 125us/op-cpu
readfile3 582ops/s 9.5mb/s 0.1ms/op 61us/op-cpu
openfile3 582ops/s 0.0mb/s 0.1ms/op 66us/op-cpu
closefile2 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 582ops/s 0.0mb/s 12.4ms/op 132us/op-cpu
appendfilerand2 582ops/s 9.1mb/s 0.1ms/op 94us/op-cpu
createfile2 582ops/s 0.0mb/s 0.2ms/op 160us/op-cpu
deletefile1 582ops/s 0.0mb/s 0.1ms/op 89us/op-cpu
458: 78.718:
IO Summary: 457505 ops 7567.3 ops/s, (1164/1164 r/w) 37.2mb/s, 340us
cpu/op, 6.4ms latency
458: 78.718: Shutting down processes
filebench> run 60
458: 98.396: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
458: 98.449: Removed any existing fileset bigfileset in 1 seconds
458: 98.449: Creating fileset bigfileset...
458: 103.837: Preallocated 786 of 1000 of fileset bigfileset in 6 seconds
458: 103.837: Creating/pre-allocating files
458: 103.837: Starting 1 filereader instances
468: 104.845: Starting 16 filereaderthread threads
458: 107.854: Running...
458: 168.345: Run took 60 seconds...
458: 168.358: Per-Operation Breakdown
closefile4 582ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 582ops/s 9.4mb/s 0.1ms/op 61us/op-cpu
openfile4 582ops/s 0.0mb/s 0.1ms/op 66us/op-cpu
closefile3 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 582ops/s 0.0mb/s 12.5ms/op 207us/op-cpu
appendfilerand3 582ops/s 9.1mb/s 0.1ms/op 124us/op-cpu
readfile3 582ops/s 9.4mb/s 0.1ms/op 61us/op-cpu
openfile3 582ops/s 0.0mb/s 0.1ms/op 66us/op-cpu
closefile2 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 582ops/s 0.0mb/s 12.3ms/op 132us/op-cpu
appendfilerand2 582ops/s 9.1mb/s 0.1ms/op 94us/op-cpu
createfile2 582ops/s 0.0mb/s 0.2ms/op 156us/op-cpu
deletefile1 582ops/s 0.0mb/s 0.1ms/op 89us/op-cpu
458: 168.359:
IO Summary: 457767 ops 7567.8 ops/s, (1164/1165 r/w) 36.9mb/s, 340us
cpu/op, 6.4ms latency
458: 168.359: Shutting down processes
filebench>
# zpool destroy HW_RAID5_6disks
# newfs -C 20 /dev/rdsk/c2t43d0s0
newfs: construct a new file system /dev/rdsk/c2t43d0s0: (y/n)? y
Warning: 68 sector(s) in last cylinder unallocated
/dev/rdsk/c2t43d0s0: 714233788 sectors in 116249 cylinders of 48 tracks, 128
sectors
348747.0MB in 7266 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
...............................................................................
..................................................................
super-block backups for last 10 cylinder groups at:
713296928, 713395360, 713493792, 713592224, 713690656, 713789088, 713887520,
713985952, 714084384, 714182816
#
# mount -o noatime /dev/dsk/c2t43d0s0 /mnt
#
# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
546: 2.573: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
546: 2.573: Usage: set $dir=<dir>
546: 2.573: set $filesize=<size> defaults to 16384
546: 2.573: set $nfiles=<value> defaults to 1000
546: 2.574: set $nthreads=<value> defaults to 16
546: 2.574: set $meaniosize=<value> defaults to 16384
546: 2.574: set $meandirwidth=<size> defaults to 1000000
546: 2.574: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
546: 2.574: dirdepth therefore defaults to dir depth of 1 as in postmark
546: 2.574: set $meandir lower to increase depth beyond 1 if desired)
546: 2.574:
546: 2.574: run runtime (e.g. run 60)
546: 2.574: syntax error, token expected on line 51
filebench> set $dir=/mnt
filebench> run 60
546: 22.095: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
546: 22.109: Creating fileset bigfileset...
546: 24.577: Preallocated 812 of 1000 of fileset bigfileset in 3 seconds
546: 24.577: Creating/pre-allocating files
546: 24.577: Starting 1 filereader instances
548: 25.584: Starting 16 filereaderthread threads
546: 28.594: Running...
546: 89.114: Run took 60 seconds...
546: 89.128: Per-Operation Breakdown
closefile4 80ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 80ops/s 1.5mb/s 0.1ms/op 76us/op-cpu
openfile4 80ops/s 0.0mb/s 0.0ms/op 39us/op-cpu
closefile3 80ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile3 80ops/s 0.0mb/s 29.2ms/op 107us/op-cpu
appendfilerand3 80ops/s 1.2mb/s 30.4ms/op 189us/op-cpu
readfile3 80ops/s 1.5mb/s 0.1ms/op 73us/op-cpu
openfile3 80ops/s 0.0mb/s 0.0ms/op 38us/op-cpu
closefile2 80ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 80ops/s 0.0mb/s 30.8ms/op 125us/op-cpu
appendfilerand2 80ops/s 1.2mb/s 22.6ms/op 173us/op-cpu
createfile2 80ops/s 0.0mb/s 37.2ms/op 224us/op-cpu
deletefile1 80ops/s 0.0mb/s 48.5ms/op 108us/op-cpu
546: 89.128:
IO Summary: 62776 ops 1037.3 ops/s, (160/160 r/w) 5.5mb/s, 481us
cpu/op, 49.7ms latency
546: 89.128: Shutting down processes
filebench> run 60
546: 738.541: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
546: 739.455: Removed any existing fileset bigfileset in 1 seconds
546: 739.455: Creating fileset bigfileset...
546: 741.387: Preallocated 786 of 1000 of fileset bigfileset in 2 seconds
546: 741.387: Creating/pre-allocating files
546: 741.387: Starting 1 filereader instances
557: 742.394: Starting 16 filereaderthread threads
546: 745.404: Running...
546: 805.944: Run took 60 seconds...
546: 805.958: Per-Operation Breakdown
closefile4 81ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 81ops/s 1.5mb/s 0.1ms/op 73us/op-cpu
openfile4 81ops/s 0.0mb/s 0.0ms/op 38us/op-cpu
closefile3 81ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 81ops/s 0.0mb/s 27.8ms/op 105us/op-cpu
appendfilerand3 81ops/s 1.3mb/s 28.6ms/op 187us/op-cpu
readfile3 81ops/s 1.4mb/s 0.1ms/op 70us/op-cpu
openfile3 81ops/s 0.0mb/s 0.0ms/op 37us/op-cpu
closefile2 81ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 81ops/s 0.0mb/s 29.9ms/op 124us/op-cpu
appendfilerand2 81ops/s 1.3mb/s 23.6ms/op 171us/op-cpu
createfile2 81ops/s 0.0mb/s 38.9ms/op 220us/op-cpu
deletefile1 81ops/s 0.0mb/s 47.4ms/op 109us/op-cpu
546: 805.958:
IO Summary: 63661 ops 1051.6 ops/s, (162/162 r/w) 5.4mb/s, 477us
cpu/op, 49.1ms latency
546: 805.958: Shutting down processes
filebench>
#### solaris 10 06/06 + patches, server with the same hardware specs #####
##### test # 4
# mount -o noatime /dev/dsk/c3t40d0s0 /mnt
# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
1384: 3.678: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
1384: 3.679: Usage: set $dir=<dir>
1384: 3.679: set $filesize=<size> defaults to 16384
1384: 3.679: set $nfiles=<value> defaults to 1000
1384: 3.679: set $nthreads=<value> defaults to 16
1384: 3.679: set $meaniosize=<value> defaults to 16384
1384: 3.679: set $meandirwidth=<size> defaults to 1000000
1384: 3.679: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
1384: 3.679: dirdepth therefore defaults to dir depth of 1 as in postmark
1384: 3.679: set $meandir lower to increase depth beyond 1 if desired)
1384: 3.680:
1384: 3.680: run runtime (e.g. run 60)
1384: 3.680: syntax error, token expected on line 51
filebench> set $dir=/mnt
filebench> run 60
1384: 10.872: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
1384: 11.858: Removed any existing fileset bigfileset in 1 seconds
1384: 11.859: Creating fileset bigfileset...
1384: 14.221: Preallocated 812 of 1000 of fileset bigfileset in 3 seconds
1384: 14.221: Creating/pre-allocating files
1384: 14.221: Starting 1 filereader instances
1387: 15.231: Starting 16 filereaderthread threads
1384: 18.241: Running...
1384: 78.701: Run took 60 seconds...
1384: 78.715: Per-Operation Breakdown
closefile4 500ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 500ops/s 8.4mb/s 0.1ms/op 65us/op-cpu
openfile4 500ops/s 0.0mb/s 0.0ms/op 36us/op-cpu
closefile3 500ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile3 500ops/s 0.0mb/s 9.7ms/op 169us/op-cpu
appendfilerand3 500ops/s 7.8mb/s 2.6ms/op 187us/op-cpu
readfile3 500ops/s 8.3mb/s 0.1ms/op 64us/op-cpu
openfile3 500ops/s 0.0mb/s 0.0ms/op 36us/op-cpu
closefile2 500ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 500ops/s 0.0mb/s 8.4ms/op 154us/op-cpu
appendfilerand2 500ops/s 7.8mb/s 1.7ms/op 168us/op-cpu
createfile2 500ops/s 0.0mb/s 4.3ms/op 298us/op-cpu
deletefile1 500ops/s 0.0mb/s 3.2ms/op 144us/op-cpu
1384: 78.715:
IO Summary: 393167 ops 6503.1 ops/s, (1000/1001 r/w) 32.4mb/s, 405us
cpu/op, 7.5ms latency
1384: 78.715: Shutting down processes
filebench> run 60
1384: 94.146: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
1384: 95.767: Removed any existing fileset bigfileset in 2 seconds
1384: 95.768: Creating fileset bigfileset...
1384: 97.972: Preallocated 786 of 1000 of fileset bigfileset in 3 seconds
1384: 97.973: Creating/pre-allocating files
1384: 97.973: Starting 1 filereader instances
1393: 98.981: Starting 16 filereaderthread threads
1384: 101.991: Running...
1384: 162.491: Run took 60 seconds...
1384: 162.505: Per-Operation Breakdown
closefile4 502ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 502ops/s 8.1mb/s 0.1ms/op 64us/op-cpu
openfile4 502ops/s 0.0mb/s 0.0ms/op 37us/op-cpu
closefile3 502ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile3 502ops/s 0.0mb/s 9.9ms/op 172us/op-cpu
appendfilerand3 502ops/s 7.8mb/s 2.7ms/op 189us/op-cpu
readfile3 502ops/s 8.2mb/s 0.1ms/op 65us/op-cpu
openfile3 502ops/s 0.0mb/s 0.0ms/op 37us/op-cpu
closefile2 502ops/s 0.0mb/s 0.0ms/op 12us/op-cpu
fsyncfile2 502ops/s 0.0mb/s 8.6ms/op 156us/op-cpu
appendfilerand2 502ops/s 7.8mb/s 1.7ms/op 166us/op-cpu
createfile2 502ops/s 0.0mb/s 4.4ms/op 301us/op-cpu
deletefile1 502ops/s 0.0mb/s 3.2ms/op 148us/op-cpu
1384: 162.506:
IO Summary: 394525 ops 6521.2 ops/s, (1003/1003 r/w) 32.0mb/s, 407us
cpu/op, 7.7ms latency
1384: 162.506: Shutting down processes
filebench>
#### test 5
#### these are the same disks as used in test #2
# zpool create zfs_raid5_6disks raidz c2t16d0 c2t17d0 c2t18d0 c2t19d0 c2t20d0
c2t21d0
# zfs set atime=off zfs_raid5_6disks
# zfs create zfs_raid5_6disks/t1
#
# /opt/filebench/bin/sparcv9/filebench
filebench> load varmail
1437: 3.762: Varmail Version 1.24 2005/06/22 08:08:30 personality successfully
loaded
1437: 3.762: Usage: set $dir=<dir>
1437: 3.762: set $filesize=<size> defaults to 16384
1437: 3.762: set $nfiles=<value> defaults to 1000
1437: 3.763: set $nthreads=<value> defaults to 16
1437: 3.763: set $meaniosize=<value> defaults to 16384
1437: 3.763: set $meandirwidth=<size> defaults to 1000000
1437: 3.763: (sets mean dir width and dir depth is calculated as log (width,
nfiles)
1437: 3.763: dirdepth therefore defaults to dir depth of 1 as in postmark
1437: 3.763: set $meandir lower to increase depth beyond 1 if desired)
1437: 3.763:
1437: 3.763: run runtime (e.g. run 60)
1437: 3.763: syntax error, token expected on line 51
filebench> set $dir=/zfs_raid5_6disks/t1
filebench> run 60
1437: 13.102: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
1437: 13.102: Creating fileset bigfileset...
1437: 20.092: Preallocated 812 of 1000 of fileset bigfileset in 7 seconds
1437: 20.092: Creating/pre-allocating files
1437: 20.092: Starting 1 filereader instances
1438: 21.095: Starting 16 filereaderthread threads
1437: 24.105: Running...
1437: 84.575: Run took 60 seconds...
1437: 84.589: Per-Operation Breakdown
closefile4 587ops/s 0.0mb/s 0.0ms/op 9us/op-cpu
readfile4 587ops/s 9.5mb/s 0.1ms/op 63us/op-cpu
openfile4 587ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile3 587ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 587ops/s 0.0mb/s 12.1ms/op 196us/op-cpu
appendfilerand3 587ops/s 9.2mb/s 0.1ms/op 123us/op-cpu
readfile3 587ops/s 9.5mb/s 0.1ms/op 64us/op-cpu
openfile3 587ops/s 0.0mb/s 0.1ms/op 63us/op-cpu
closefile2 587ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 587ops/s 0.0mb/s 12.6ms/op 145us/op-cpu
appendfilerand2 588ops/s 9.2mb/s 0.1ms/op 93us/op-cpu
createfile2 587ops/s 0.0mb/s 0.2ms/op 166us/op-cpu
deletefile1 587ops/s 0.0mb/s 0.1ms/op 90us/op-cpu
1437: 84.589:
IO Summary: 461708 ops 7635.5 ops/s, (1175/1175 r/w) 37.4mb/s, 330us
cpu/op, 6.4ms latency
1437: 84.589: Shutting down processes
filebench> run 60
1437: 136.114: Fileset bigfileset: 1000 files, avg dir = 1000000.0, avg depth =
0.5, mbytes=15
1437: 136.171: Removed any existing fileset bigfileset in 1 seconds
1437: 136.172: Creating fileset bigfileset...
1437: 141.880: Preallocated 786 of 1000 of fileset bigfileset in 6 seconds
1437: 141.880: Creating/pre-allocating files
1437: 141.880: Starting 1 filereader instances
1441: 142.885: Starting 16 filereaderthread threads
1437: 145.895: Running...
1437: 206.415: Run took 60 seconds...
1437: 206.429: Per-Operation Breakdown
closefile4 582ops/s 0.0mb/s 0.0ms/op 8us/op-cpu
readfile4 582ops/s 9.4mb/s 0.1ms/op 63us/op-cpu
openfile4 582ops/s 0.0mb/s 0.1ms/op 62us/op-cpu
closefile3 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile3 582ops/s 0.0mb/s 12.2ms/op 202us/op-cpu
appendfilerand3 582ops/s 9.1mb/s 0.1ms/op 122us/op-cpu
readfile3 582ops/s 9.4mb/s 0.1ms/op 64us/op-cpu
openfile3 582ops/s 0.0mb/s 0.1ms/op 62us/op-cpu
closefile2 582ops/s 0.0mb/s 0.0ms/op 11us/op-cpu
fsyncfile2 582ops/s 0.0mb/s 12.9ms/op 141us/op-cpu
appendfilerand2 582ops/s 9.1mb/s 0.1ms/op 91us/op-cpu
createfile2 582ops/s 0.0mb/s 0.2ms/op 157us/op-cpu
deletefile1 582ops/s 0.0mb/s 0.1ms/op 89us/op-cpu
1437: 206.429:
IO Summary: 457649 ops 7562.1 ops/s, (1163/1164 r/w) 37.0mb/s, 328us
cpu/op, 6.5ms latency
1437: 206.429: Shutting down processes
filebench>
This message posted from opensolaris.org
Luke Lonergan
2006-Aug-08 14:48 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Does snv44 have the ZFS fixes to the I/O scheduler, the ARC and the prefetch
logic?
These are great results for random I/O, I wonder how the sequential I/O looks?
Of course you'll not get great results for sequential I/O on the 3510 :-)
- Luke
Sent from my GoodLink synchronized handheld (www.good.com)
-----Original Message-----
From: Robert Milkowski [mailto:milek at task.gda.pl]
Sent: Tuesday, August 08, 2006 10:15 AM Eastern Standard Time
To: zfs-discuss at opensolaris.org
Subject: [zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi.
This time some RAID5/RAID-Z benchmarks.
This time I connected the 3510 head unit with one link to the same server the
3510 JBODs are connected to (using the second link). snv_44 is used, the server
is a v440.
I also tried changing the max pending IO requests for the HW raid5 lun and
checked with DTrace that the larger value is really used - it is, but it
doesn't change the benchmark numbers.
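For anyone who wants to repeat that kind of check, a rough sketch of the idea
(not the exact script used here) is to watch the per-device queue depth with
the DTrace io provider while the benchmark runs:

# dtrace -n '
io:::start { pending[args[1]->dev_statname]++;
             @maxq[args[1]->dev_statname] = max(pending[args[1]->dev_statname]); }
io:::done  { pending[args[1]->dev_statname]--; }
tick-60s   { printa("%-16s max outstanding: %@u\n", @maxq); exit(0); }'

The per-device maximum reported there should track whatever pending-I/O limit
is actually in effect.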
1. ZFS on HW RAID5 with 6 disks, atime=off
IO Summary: 444386 ops 7341.7 ops/s, (1129/1130 r/w) 36.1mb/s, 297us
cpu/op, 6.6ms latency
IO Summary: 438649 ops 7247.0 ops/s, (1115/1115 r/w) 35.5mb/s, 293us
cpu/op, 6.7ms latency
2. ZFS with software RAID-Z with 6 disks, atime=off
IO Summary: 457505 ops 7567.3 ops/s, (1164/1164 r/w) 37.2mb/s, 340us
cpu/op, 6.4ms latency
IO Summary: 457767 ops 7567.8 ops/s, (1164/1165 r/w) 36.9mb/s, 340us
cpu/op, 6.4ms latency
3. UFS on HW RAID5 with 6 disks, noatime
IO Summary: 62776 ops 1037.3 ops/s, (160/160 r/w) 5.5mb/s, 481us
cpu/op, 49.7ms latency
IO Summary: 63661 ops 1051.6 ops/s, (162/162 r/w) 5.4mb/s, 477us
cpu/op, 49.1ms latency
4. UFS on HW RAID5 with 6 disks, noatime, S10U2 + patches (the same filesystem
mounted as in 3)
IO Summary: 393167 ops 6503.1 ops/s, (1000/1001 r/w) 32.4mb/s, 405us
cpu/op, 7.5ms latency
IO Summary: 394525 ops 6521.2 ops/s, (1003/1003 r/w) 32.0mb/s, 407us
cpu/op, 7.7ms latency
5. ZFS with software RAID-Z with 6 disks, atime=off, S10U2 + patches (the same
disks as in test #2)
IO Summary: 461708 ops 7635.5 ops/s, (1175/1175 r/w) 37.4mb/s, 330us
cpu/op, 6.4ms latency
IO Summary: 457649 ops 7562.1 ops/s, (1163/1164 r/w) 37.0mb/s, 328us
cpu/op, 6.5ms latency
In this benchmark software raid-5 with ZFS (raid-z to be precise) gives a little
bit better performance than hardware raid-5. ZFS is also faster in both cases
(HW and SW raid) than UFS on HW raid.
Something is wrong with UFS on snv_44 - the same ufs filesystem on S10U2 works
as expected.
ZFS on S10U2 in this benchmark gives the same results as on snv_44.
Robert Milkowski
2006-Aug-08 16:11 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Luke,
Tuesday, August 8, 2006, 4:48:38 PM, you wrote:
LL> Does snv44 have the ZFS fixes to the I/O scheduler, the ARC and the
LL> prefetch logic?
LL> These are great results for random I/O, I wonder how the sequential I/O
LL> looks?
LL> Of course you'll not get great results for sequential I/O on the
LL> 3510 :-)
filebench/singlestreamread v440
1. UFS, noatime, HW RAID5 6 disks, S10U2
70MB/s
2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
87MB/s
3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
130MB/s
4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
133MB/s
ps.
With software RAID-Z I got about 940MB/s at first :)))) well, after the files
were created they were all cached and ZFS almost didn't touch the disks :)
ok, I changed filesize to be well over the memory size of the server and the
above results are with that larger filesize.
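Roughly, such a run looks like the varmail sessions shown earlier, just with
the streaming personality pointed at a file bigger than RAM; the $filesize
knob and the exact value syntax are assumptions here, so check the
personality's usage output:

# /opt/filebench/bin/sparcv9/filebench
filebench> load singlestreamread
filebench> set $dir=/zfs_raid5_6disks/t1
filebench> set $filesize=17179869184
filebench> run 60

(17179869184 bytes = 16GB, i.e. twice the 8GB of RAM in the v440, so the
cache cannot hold the whole file.)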
filebench/singlestreamwrite v440
1. UFS, noatime, HW RAID-5 6 disks, S10U2
70MB/s
2. ZFS, atime=off, HW RAID-5 6 disks, S10U2 (the same lun as in #1)
52MB/s
3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
148MB/s
4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
147MB/s
So sequential writing in ZFS on HWR5 is actually worse than UFS.
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Luke Lonergan
2006-Aug-08 16:18 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Robert,

On 8/8/06 9:11 AM, "Robert Milkowski" <rmilkowski at task.gda.pl> wrote:

> 1. UFS, noatime, HW RAID5 6 disks, S10U2
>    70MB/s
> 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
>    87MB/s
> 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
>    130MB/s
> 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
>    133MB/s

Well, the UFS results are miserable, but the ZFS results aren't good - I'd
expect between 250-350MB/s from a 6-disk RAID5 with read() blocksize from
8kb to 32kb.

Most of my ZFS experiments have been with RAID10, but there were some
massive improvements to seq I/O with the fixes I mentioned - I'd expect that
this shows that they aren't in snv44.

- Luke
Robert Milkowski
2006-Aug-08 16:32 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Luke,

Tuesday, August 8, 2006, 6:18:39 PM, you wrote:

LL> Robert,
LL> On 8/8/06 9:11 AM, "Robert Milkowski" <rmilkowski at task.gda.pl> wrote:

>> 1. UFS, noatime, HW RAID5 6 disks, S10U2
>>    70MB/s
>> 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
>>    87MB/s
>> 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
>>    130MB/s
>> 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
>>    133MB/s

LL> Well, the UFS results are miserable, but the ZFS results aren't good - I'd
LL> expect between 250-350MB/s from a 6-disk RAID5 with read() blocksize from
LL> 8kb to 32kb.

Well, right now I'm testing with a single 200MB/s fc link so that's the
upper limit in this testing.

LL> Most of my ZFS experiments have been with RAID10, but there were some
LL> massive improvements to seq I/O with the fixes I mentioned - I'd expect that
LL> this shows that they aren't in snv44.

So where did you get those fixes?

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Mark Maybee
2006-Aug-08 16:33 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Luke Lonergan wrote:
> Robert,
>
> On 8/8/06 9:11 AM, "Robert Milkowski" <rmilkowski at task.gda.pl> wrote:
>
>> 1. UFS, noatime, HW RAID5 6 disks, S10U2
>>    70MB/s
>> 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
>>    87MB/s
>> 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
>>    130MB/s
>> 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
>>    133MB/s
>
> Well, the UFS results are miserable, but the ZFS results aren't good - I'd
> expect between 250-350MB/s from a 6-disk RAID5 with read() blocksize from
> 8kb to 32kb.
>
> Most of my ZFS experiments have been with RAID10, but there were some
> massive improvements to seq I/O with the fixes I mentioned - I'd expect that
> this shows that they aren't in snv44.
>
Those fixes went into snv_45

-Mark
Luke Lonergan
2006-Aug-08 16:52 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Robert,

> LL> Most of my ZFS experiments have been with RAID10, but there were some
> LL> massive improvements to seq I/O with the fixes I mentioned - I'd expect that
> LL> this shows that they aren't in snv44.
>
> So where did you get those fixes?

From the fine people who implemented them!

As Mark said, apparently they're available in snv_45 (yay!)

- Luke
Doug Scott
2006-Aug-08 17:15 UTC
[zfs-discuss] Re: Re[2]: Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
> Robert,
>
> On 8/8/06 9:11 AM, "Robert Milkowski" <rmilkowski at task.gda.pl> wrote:
>
> > 1. UFS, noatime, HW RAID5 6 disks, S10U2
> >    70MB/s
> > 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
> >    87MB/s
> > 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
> >    130MB/s
> > 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
> >    133MB/s
>
> Well, the UFS results are miserable, but the ZFS results aren't good - I'd
> expect between 250-350MB/s from a 6-disk RAID5 with read() blocksize from
> 8kb to 32kb.
>
> Most of my ZFS experiments have been with RAID10, but there were some
> massive improvements to seq I/O with the fixes I mentioned - I'd expect that
> this shows that they aren't in snv44.
>
> - Luke

I don't think there is much chance of achieving anywhere near 350MB/s.
That is a hell of a lot of IO/s for 6 disks+raid(5/Z)+shared fibre. While you
can always get very good results from a single disk IO, your percentage
gain is always decreasing the more disks you add to the equation.

From a single 200MB/s fibre, expect somewhere between 160-180MB/s,
at best.

Doug

This message posted from opensolaris.org
Matthew Ahrens
2006-Aug-08 17:25 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
On Tue, Aug 08, 2006 at 06:11:09PM +0200, Robert Milkowski wrote:
> filebench/singlestreamread v440
>
> 1. UFS, noatime, HW RAID5 6 disks, S10U2
>    70MB/s
>
> 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
>    87MB/s
>
> 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
>    130MB/s
>
> 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
>    133MB/s

FYI, Streaming read performance is improved considerably by Mark's
prefetch fixes which are in build 45.  (However, as mentioned you will
soon run into the bandwidth of a single fiber channel connection.)

--matt
Robert Milkowski
2006-Aug-08 17:29 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Matthew,

Tuesday, August 8, 2006, 7:25:17 PM, you wrote:

MA> On Tue, Aug 08, 2006 at 06:11:09PM +0200, Robert Milkowski wrote:
>> filebench/singlestreamread v440
>>
>> 1. UFS, noatime, HW RAID5 6 disks, S10U2
>>    70MB/s
>>
>> 2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
>>    87MB/s
>>
>> 3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
>>    130MB/s
>>
>> 4. ZFS, atime=off, SW RAID-Z 6 disks, snv_44
>>    133MB/s

MA> FYI, Streaming read performance is improved considerably by Mark's
MA> prefetch fixes which are in build 45.  (However, as mentioned you will
MA> soon run into the bandwidth of a single fiber channel connection.)

I will probably re-test with snv_45 (waiting for SX).

FC is not that big a problem - if I find enough time I will just add more
FC cards.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Torrey McMahon
2006-Aug-09 02:59 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Robert Milkowski wrote:
> Hello Richard,
>
> Monday, August 7, 2006, 6:54:37 PM, you wrote:
>
> RE> Hi Robert, thanks for the data.
> RE> Please clarify one thing for me.
> RE> In the case of the HW raid, was there just one LUN? Or was it 12 LUNs?
>
> Just one lun which was build on 3510 from 12 luns in raid-1(0).
>

One 12 disk Raid1 lun? One R0 lun of 12 drives?
Torrey McMahon
2006-Aug-09 03:39 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
I read through the entire thread, I think, and have some comments.

    * There are still some "granny smith" to "Macintosh" comparisons
      going on. Different OS revs, it looks like different server types,
      and I can't tell about the HBAs, links or the LUNs being tested.
    * Before you test with filebench or ZFS, perform a baseline on the
      LUN(s) itself with a block workload generator. This should tell
      the raw performance of the device, of which ZFS should be some
      percentage smaller. Make sure you use lots of threads.
    * Testing ...
          o I'd start with configuring the 3510 RAID for a sequential
            workload, one large R0 raid pool across all the drives
            exported as one LUN, ZFS block size at default, and testing
            from there. This should line the ZFS blocksize and cache
            blocksize up more than the random setting.
          o If you want to get interesting, try slicing 12 LUNs from the
            single R0 raid pool in the 3510, export those to the host,
            and stripe ZFS across them. (I have a feeling it will be
            faster but that's just a hunch.)
          o If you want to get really interesting, export each drive as a
            single R0 LUN and stripe ZFS across the 12 LUNs (which I
            think you can do but don't remember ever testing because,
            well, it would be silly, but it could show some interesting
            behaviors - see the sketch below.)
    * Some of the results appear to show limitations in something
      besides the underlying storage but it's hard to tell. Our internal
      tools - which I'm dying to get out in the public - also capture
      cpu load and some other stats to note bottlenecks that might come
      up during testing.

That said this is all great stuff. Keep kicking the tires.
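A minimal sketch of the host side of the "one R0 LUN per drive, striped in
ZFS" idea above, with made-up device names:

# zpool create -f stripe12 c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 \
      c4t6d0 c4t7d0 c4t8d0 c4t9d0 c4t10d0 c4t11d0
# zfs set atime=off stripe12
# zfs create stripe12/t1

Note there is no redundancy anywhere in that layout, which is part of why it
is called silly - it is purely a behavior/performance experiment.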
Luke Lonergan
2006-Aug-09 04:07 UTC
[zfs-discuss] Re: Re[2]: Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Doug,

On 8/8/06 10:15 AM, "Doug Scott" <dougs at truemail.co.th> wrote:

> I don't think there is much chance of achieving anywhere near 350MB/s.
> That is a hell of a lot of IO/s for 6 disks+raid(5/Z)+shared fibre. While you
> can always get very good results from a single disk IO, your percentage
> gain is always decreasing the more disks you add to the equation.
>
> From a single 200MB/s fibre, expect somewhere between 160-180MB/s,
> at best.

Momentarily forgot about the sucky single FC limit - I've become so used to
calculating drive rate, which in this case would be 80MB/s per disk for
modern 15K RPM FC or SCSI drives - then multiply by the 5 drives in a 6
drive RAID5/Z.

We routinely get 950MB/s from 16 SATA disks on a single server with internal
storage.  We're getting 2,000 MB/s on 36 disks in an X4500 with ZFS.

- Luke
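Putting that per-drive figure together with the link numbers quoted earlier
in the thread:

    5 data drives x ~80MB/s   = ~400MB/s of raw drive bandwidth
    1 x 2Gb FC link           = ~200MB/s at best

so in the single-link setup being tested here, the link rather than the
drives sets the ceiling.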
Robert Milkowski
2006-Aug-09 08:00 UTC
[zfs-discuss] Re: Re[2]: Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Luke,
Wednesday, August 9, 2006, 6:07:38 AM, you wrote:
LL> We routinely get 950MB/s from 16 SATA disks on a single server with internal
LL> storage. We're getting 2,000 MB/s on 36 disks in an X4500 with ZFS.

Can you share more data? How are these disks configured, what kind of
access pattern, etc.?
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
Robert Milkowski
2006-Aug-09 08:07 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Torrey,

Wednesday, August 9, 2006, 4:59:08 AM, you wrote:

TM> Robert Milkowski wrote:
>> Hello Richard,
>>
>> Monday, August 7, 2006, 6:54:37 PM, you wrote:
>>
>> RE> Hi Robert, thanks for the data.
>> RE> Please clarify one thing for me.
>> RE> In the case of the HW raid, was there just one LUN? Or was it 12 LUNs?
>>
>> Just one lun which was build on 3510 from 12 luns in raid-1(0).
>>
TM> One 12 disk Raid1 lun? One R0 lun of 12 drives?

If you select RAID-1 on the 3510 and specify more than two disks it will
actually do RAID-10 using pairs of disks. So I gave it all 12 disks in the
head unit and got 6 2-way mirrors which are striped.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
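The software-side equivalent of that layout - six two-way mirrors striped
together - would be built with something like the following (device names
here are hypothetical):

# zpool create jbod_raid10 \
      mirror c3t16d0 c3t17d0 mirror c3t18d0 c3t19d0 mirror c3t20d0 c3t21d0 \
      mirror c3t22d0 c3t23d0 mirror c3t24d0 c3t25d0 mirror c3t26d0 c3t27d0
# zfs set atime=off jbod_raid10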
Robert Milkowski
2006-Aug-09 08:22 UTC
[zfs-discuss] 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Torrey,
Wednesday, August 9, 2006, 5:39:54 AM, you wrote:
TM> I read through the entire thread, I think, and have some comments.
TM> * There are still some "granny smith" to "Macintosh" comparisons
TM> going on. Different OS revs, it looks like different server types,
TM> and I can't tell about the HBAs, links or the LUNs being tested.
Hmmmm.... in the first test that's true, I did use different OS
revisions, but then I corrected it and the same tests were performed
on both OSes. The server hardware is identical on both servers -
v440, 4x1.5GHz, 8GB RAM, dual-ported 2Gb FC card based on Qlogic
(1077,2312).
I also included snv_44 and S10 06/06 to see if there are real
differences in ZFS performance in those tests.
I know I haven't included all the details - some are more or less
obvious, some not.
TM> * Before you test with filebench or ZFS perform a baseline on the
TM> LUN(s) itself with a block workload generator. This should tell
TM> the raw performance of the device of which ZFS should be some
TM> percentage smaller. Make sure you use lots of threads.
Well, that's why I compared it to UFS. Ok, no SVM+UFS testing, but
anyway. I wanted some kind of quick answer to a really simple question
a lot of people are going to ask themselves (me included) - with arrays
like the 3510, is it better to use HW RAID with UFS? Or maybe HW
RAID with ZFS? Or maybe it's actually better to use only 3510 JBODs
with ZFS? There are many factors and one of them is performance.
As I want to use it as an NFS server, filebench/varmail is a good enough
approximation.
And I've got an answer - ZFS should be faster right now than UFS,
regardless of whether I use it on HW RAID or, in the case of ZFS, make
use of software RAID.
TM> * Testing ...
TM> o I'd start with configuring the 3510 RAID for a sequential
TM> workload, one large R0 raid pool across all the drives
TM> exported as one LUN, ZFS block size at default and testing
TM> from there. This should line the ZFS blocksize and cache
TM> blocksize up more than the random setting.
TM> o If you want to get interesting try slicing 12 LUNs from the
TM> single R0 raid pool in the 3510, export those to the host,
TM> and stripe ZFS across them. (I have a feeling it will be
TM> faster but that's just a hunch)
TM> o If you want to get really interesting export each drive as a
TM> single R0 LUN and stripe ZFS across the 12 LUNs (Which I
TM> think you can do but don't remember ever testing because,
TM> well, it would be silly but could show some interesting
TM> behaviors.)
I know - there are other interesting scenarios as well.
I would love to test them and do it in more detail with different
workloads, etc., if only I had the time.
TM> * Some of the results appear to show limitations in something
TM> besides the underlying storage but it's hard to tell. Our internal
TM> tools - which I'm dying to get out in the public - also capture
TM> cpu load and some other stats to note bottlenecks that might come
TM> up during testing.
It looks like it.
I would also like to test bigger configs - like 2-3 additional JBODs and
more HW RAID groups - and generate workload concurrently on many
file systems. Then try to do it in ZFS in one pool and in separate
pools and see how it behaves. I'll see about it.
TM> That said this is all great stuff. Keep kicking the tires.
:)
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
eric kustarz writes:

 > >ES> Second, you may be able to get more performance from the ZFS filesystem
 > >ES> on the HW lun by tweaking the max pending # of requests. One thing
 > >ES> we've found is that ZFS currently has a hardcoded limit of how many
 > >ES> outstanding requests to send to the underlying vdev (35). This works
 > >ES> well for most single devices, but large arrays can actually handle more,
 > >ES> and we end up leaving some performance on the floor. Currently the only
 > >ES> way to tweak this variable is through 'mdb -kw'. Try something like:
 > >
 > >Well, strange - I did try with value of 1, 60 and 256. And basically I
 > >get the same results from varmail tests.
 > >
 >
 > If vdev_reopen() is called then it will reset vq_max_pending to the
 > vdev_knob's default value.
 >
 > So you can set the "global" vq_max_pending in vdev_knob (though this
 > affects all pools and all vdevs of each pool):
 > #mdb -kw
 > > vdev_knob::print
 > ....

I think the interlace on the volume was set to 32K which means that each
128K I/O spreads to 4 disks. So the 35 vq_max_pending turns into 140 disk
I/Os which seems enough, as was found, to drive the 10-20 disks storage.

If the interlace had been set to 1M or more then I would expect
vq_max_pending to start to make a difference.

What we must try to avoid is ZFS throttling itself on vq_max_pending
when some disks have near 0 requests in their pipe.

-r
mario heimel
2006-Aug-09 13:06 UTC
[zfs-discuss] Re: Re[2]: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi.

I am very interested in ZFS compression on vs off tests, maybe you can run
another one with the 3510.

I have seen a slight benefit with compression on in the following test
(also with high system load):

S10U2
v880 8xcore 16GB ram
(only six internal disks at this moment, I wait for the san luns -;)

filebench/varmail test with default parameter and run for 60 seconds

zpool create lzp raidz c0t2d0 c0t3d0 c0t4d0 c0t5d0

ZFS compression on
IO Summary: 284072 ops 4688.6 ops/s, (721/722 r/w) 24.2mb/s, 544us cpu/op, 10.2ms latency
IO Summary: 295985 ops 4887.7 ops/s, (752/752 r/w) 25.2mb/s, 539us cpu/op, 9.8ms latency
IO Summary: 337249 ops 5568.1 ops/s, (857/857 r/w) 28.5mb/s, 529us cpu/op, 8.6ms latency
IO Summary: 306231 ops 5055.1 ops/s, (778/778 r/w) 25.9mb/s, 531us cpu/op, 9.4ms latency

ZFS compression off
IO Summary: 284828 ops 4701.8 ops/s, (723/724 r/w) 24.0mb/s, 553us cpu/op, 10.2ms latency
IO Summary: 276570 ops 4565.5 ops/s, (702/703 r/w) 23.3mb/s, 543us cpu/op, 10.5ms latency
IO Summary: 276570 ops 4565.5 ops/s, (702/703 r/w) 23.3mb/s, 543us cpu/op, 10.5ms latency
IO Summary: 264656 ops 4370.3 ops/s, (672/673 r/w) 22.1mb/s, 546us cpu/op, 11.1ms latency
IO Summary: 264656 ops 4370.3 ops/s, (672/673 r/w) 22.1mb/s, 546us cpu/op, 11.1ms latency

test under heavy avg. system load 9

compression on
IO Summary: 285405 ops 4701.3 ops/s, (723/724 r/w) 22.9mb/s, 5370us cpu/op, 10.1ms latency
IO Summary: 285946 ops 4719.5 ops/s, (726/726 r/w) 23.3mb/s, 5342us cpu/op, 10.0ms latency
IO Summary: 307347 ops 5074.4 ops/s, (781/781 r/w) 24.6mb/s, 4964us cpu/op, 9.3ms latency
IO Summary: 271030 ops 4472.6 ops/s, (688/688 r/w) 22.1mb/s, 5650us cpu/op, 10.5ms latency

compression off
IO Summary: 277434 ops 4579.8 ops/s, (705/705 r/w) 22.6mb/s, 5520us cpu/op, 10.4ms latency
IO Summary: 259470 ops 4283.9 ops/s, (659/659 r/w) 21.2mb/s, 5913us cpu/op, 11.2ms latency
IO Summary: 272979 ops 4506.2 ops/s, (693/693 r/w) 22.0mb/s, 5601us cpu/op, 10.4ms latency
IO Summary: 271089 ops 4475.8 ops/s, (689/689 r/w) 22.2mb/s, 5644us cpu/op, 10.6ms latency

This message posted from opensolaris.org
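The compression switch between those runs is presumably just the pool-level
property, along these lines:

# zpool create lzp raidz c0t2d0 c0t3d0 c0t4d0 c0t5d0
# zfs set compression=on lzp
// ... "compression on" passes ...
# zfs set compression=off lzp
// ... "compression off" passes ...
# zfs get compression,compressratio lzp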
Roch
2006-Aug-09 15:36 UTC
[zfs-discuss] Re: Re[2]: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
mario heimel writes:
 > I am very interested in ZFS compression on vs off tests, maybe you can run
 > another one with the 3510.
 >
 > I have seen a slight benefit with compression on in the following test
 > (also with high system load):

Beware that filebench creates zero-filled files which
compress rather well.  YMMV.

-r
Robert Milkowski
2006-Aug-09 16:03 UTC
[zfs-discuss] Re: Re[2]: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Roch,
Wednesday, August 9, 2006, 5:36:39 PM, you wrote:
R> Beware that filebench creates zero-filled files which
R> compress rather well. YMMV.
To be completely honest, such blocks aren't actually compressed by
ZFS: if a whole block is all zeros and compression is on, then no
compression is run for that block at all and no data block is
written.
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
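To see the behaviour Robert describes - whole blocks of zeros are detected and simply never written out - a small experiment along these lines can be used (pool, dataset and file names are made up for illustration):

zfs create tank/ztest
zfs set compression=on tank/ztest

# write 128 MB of zeros, then compare logical size with allocated space
dd if=/dev/zero of=/tank/ztest/zeros bs=128k count=1000
sync
ls -l /tank/ztest/zeros    # reports the full 128 MB logical size
du -h /tank/ztest/zeros    # allocated space should be close to zero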
Dave Fisk
2006-Aug-09 22:29 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi,

Note that these are page cache rates and that if the application pushes
harder and exposes the supporting device rates there is another world of
performance to be observed. This is where ZFS gets to be a challenge as
the relationship between the application level I/O and the pool level is
very hard to predict. For example the COW may or may not have to read
old data for a small I/O update operation, and a large portion of the
pool vdev capability can be spent on this kind of overhead. Also, on
read, if the pattern is random, you may or may not receive any benefit
from the 32 KB to 128 KB reads on each disk of the pool vdev on behalf
of a small read, say 8 KB by the application, again lots of overhead
potential.

I am not complaining, ZFS is great, I'm a fan, but you definitely have
your work cut out for you if you want to predict its ability to scale
for any given workload.

Cheers,

Dave (the ORtera man)

This message posted from opensolaris.org
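One way to get a feel for the gap Dave describes between application-level I/O and what the pool actually does is to watch both levels during a run (a sketch only; "tank" is an illustrative pool name):

# per-vdev I/O as ZFS issues it to the pool, sampled every 5 seconds
zpool iostat -v tank 5

# per-device physical I/O rates and service times, for comparison
iostat -xn 5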
Eric Schrock
2006-Aug-09 22:35 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
On Wed, Aug 09, 2006 at 03:29:05PM -0700, Dave Fisk wrote:
> For example the COW may or may not have to read old data for a small
> I/O update operation, and a large portion of the pool vdev capability
> can be spent on this kind of overhead.

This is what the 'recordsize' property is for. If you have a workload
that works on large files in very small sized chunks, setting the
recordsize before creating the files will result in a big improvement.

> Also, on read, if the pattern is random, you may or may not
> receive any benefit from the 32 KB to 128 KB reads on each disk of the
> pool vdev on behalf of a small read, say 8 KB by the application,
> again lots of overhead potential.

We're evaluating the tradeoffs on this one. The original vdev cache has
been around forever, and hasn't really been reevaluated in the context
of the latest improvements. See:

6437054 vdev_cache: wise up or die

The DMU-level prefetch code had to undergo a similar overhaul, and was
fixed up in build 45.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
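A sketch of the tuning Eric mentions - the property has to be in place before the files are created, since it only applies to newly written files (pool and dataset names are illustrative):

# dataset intended for small (e.g. 8 KB) random I/O on large files
zfs create tank/smallio
zfs set recordsize=8k tank/smallio
zfs get recordsize tank/smallio

# ... now create/copy the data files into /tank/smallio ...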
Dave C. Fisk
2006-Aug-09 23:24 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi Eric,

Thanks for the information.

I am aware of the recsize option and its intended use. However, when I
was exploring it to confirm the expected behavior, what I found was the
opposite!

The test case was build 38, Solaris 11, a 2 GB file, initially created
with 1 MB SW, and a recsize of 8 KB, on a pool with two raid-z 5+1,
accessed with 24 threads of 8 KB RW, for 500,000 ops or 40 seconds,
whichever came first. The result at the pool level was 78% of the
operations were RR, all overhead. For the same test, with a 128 KB
recsize (the default), the pool access was pure SW, beautiful.

I ran this test 5 times. The test results with an 8 KB recsize were
consistent; however, ONE of the 128 KB recsize tests did have 62% RR at
the pool level... this is not exactly a confidence builder for
predictability.

As I understand the striping logic is separate from the on-disk format
and can be changed in the future, I would suggest a variant of raid-z
(raid-z+) that would have a variable stripe width instead of a variable
stripe unit. The worst case would be 1+1, but you would generally do
better than mirroring in terms of the number of drives used for
protection, and you could avoid dividing an 8 KB I/O over say 5, 10 or
(god forbid) 47 drives. It would be much less overhead, something like
200 to 1 in one analysis (if I recall correctly), and hence much better
performance.

I will be happy to post ORtera summary reports for a pair of these
tests if you would like to see the numbers. However, the forum would be
the better place to post the reports.

Regards,

Dave

Eric Schrock wrote:

>On Wed, Aug 09, 2006 at 03:29:05PM -0700, Dave Fisk wrote:
>
>>For example the COW may or may not have to read old data for a small
>>I/O update operation, and a large portion of the pool vdev capability
>>can be spent on this kind of overhead.
>
>This is what the 'recordsize' property is for. If you have a workload
>that works on large files in very small sized chunks, setting the
>recordsize before creating the files will result in a big improvement.
>
>>Also, on read, if the pattern is random, you may or may not
>>receive any benefit from the 32 KB to 128 KB reads on each disk of the
>>pool vdev on behalf of a small read, say 8 KB by the application,
>>again lots of overhead potential.
>
>We're evaluating the tradeoffs on this one. The original vdev cache has
>been around forever, and hasn't really been reevaluated in the context
>of the latest improvements. See:
>
>6437054 vdev_cache: wise up or die
>
>The DMU-level prefetch code had to undergo a similar overhaul, and was
>fixed up in build 45.
>
>- Eric
>
>--
>Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock

--
Dave Fisk, ORtera Inc.
Phone (562) 433-7078
DFisk at ORtera.com
http://www.ORtera.com
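For reference, a pool with the layout Dave describes (two raid-z 5+1 vdevs) would be built roughly like this; the device names are made up:

zpool create tank \
    raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0

zpool status tank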
Matthew Ahrens
2006-Aug-10 01:37 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
On Wed, Aug 09, 2006 at 04:24:55PM -0700, Dave C. Fisk wrote:
> Hi Eric,
>
> Thanks for the information.
>
> I am aware of the recsize option and its intended use. However, when I
> was exploring it to confirm the expected behavior, what I found was the
> opposite!
>
> The test case was build 38, Solaris 11, a 2 GB file, initially created
> with 1 MB SW, and a recsize of 8 KB, on a pool with two raid-z 5+1,
> accessed with 24 threads of 8 KB RW, for 500,000 ops or 40 seconds,
> whichever came first. The result at the pool level was 78% of the
> operations were RR, all overhead. For the same test, with a 128 KB
> recsize (the default), the pool access was pure SW, beautiful.

I'm not sure what RR means, but you should re-try your tests on build 42
or later. Earlier builds have bug 6424554 "full block re-writes need
not read data in" which will cause a lot more data to be read than is
necessary, when overwriting entire blocks.

--matt
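Before re-running the tests it is worth confirming which build the box is actually on; on Nevada the build number shows up in the kernel version string (the output shown in the comments is only an example):

uname -v          # e.g. snv_44
cat /etc/release  # full release banner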
Dave C. Fisk
2006-Aug-10 02:03 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hi Matthew,

In the case of the 8 KB Random Write to the 128 KB recsize filesystem
the I/O were not full block re-writes, yet the expected COW Random Read
(RR) at the pool level is somehow avoided. I suspect it was able to
coalesce enough I/O in the 5 second transaction window to construct
128 KB blocks. This was, after all, 24 threads of I/O to a 2 GB file at
a rate of 140,000 IOPS. However, when using the 8 KB recsize it was not
able to do this.

To clarify the earlier results: the 128 KB recsize case (an 8 KB update
to a 128 KB block) did not have much Random Read (RR) at the pool
level; the 8 KB RW to the 8 KB recsize filesystem is where I generally
observed RR at the pool level. RR is Random Read, RW is Random Write.

I will check to see if it's fixed in b45.

Thanks!

Dave

Matthew Ahrens wrote:

>On Wed, Aug 09, 2006 at 04:24:55PM -0700, Dave C. Fisk wrote:
>
>>Hi Eric,
>>
>>Thanks for the information.
>>
>>I am aware of the recsize option and its intended use. However, when I
>>was exploring it to confirm the expected behavior, what I found was the
>>opposite!
>>
>>The test case was build 38, Solaris 11, a 2 GB file, initially created
>>with 1 MB SW, and a recsize of 8 KB, on a pool with two raid-z 5+1,
>>accessed with 24 threads of 8 KB RW, for 500,000 ops or 40 seconds,
>>whichever came first. The result at the pool level was 78% of the
>>operations were RR, all overhead. For the same test, with a 128 KB
>>recsize (the default), the pool access was pure SW, beautiful.
>
>I'm not sure what RR means, but you should re-try your tests on build 42
>or later. Earlier builds have bug 6424554 "full block re-writes need
>not read data in" which will cause a lot more data to be read than is
>necessary, when overwriting entire blocks.
>
>--matt

--
Dave Fisk, ORtera Inc.
Phone (562) 433-7078
DFisk at ORtera.com
http://www.ORtera.com
Robert Milkowski
2006-Aug-10 08:51 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Hello Dave,
Thursday, August 10, 2006, 12:29:05 AM, you wrote:
DF> Hi,
DF> Note that these are page cache rates and that if the application
DF> pushes harder and exposes the supporting device rates there is
DF> another world of performance to be observed. This is where ZFS
DF> gets to be a challenge as the relationship between the application
DF> level I/O and the pool level is very hard to predict. For example
DF> the COW may or may not have to read old data for a small I/O
DF> update operation, and a large portion of the pool vdev capability
DF> can be spent on this kind of overhead. Also, on read, if the
DF> pattern is random, you may or may not receive any benefit from the
DF> 32 KB to 128 KB reads on each disk of the pool vdev on behalf of a
DF> small read, say 8 KB by the application, again lots of overhead
DF> potential. I am not complaining, ZFS is great, I'm a fan, but you
DF> definitely have your work cut out for you if you want to predict
DF> its ability to scale for any given workload.
I know, you have valid concerns.
However, in the tests I performed ZFS behaved better than UFS, and that
was what mattered most to me.
Does that mean it will also perform better than UFS in production?
Well, I don't know - but thanks to these tests (and some others I
haven't posted) I'm more confident that it is not likely to behave
worse. And that is only the performance point of view; other aspects
matter as well.
ps. However, I'm really concerned about ZFS behavior when a pool is
almost full, there are lots of write transactions to that pool, and the
server is restarted forcibly or panics. I observed that file systems
on that pool each take 10-30 minutes to mount during zfs mount -a, with
one CPU completely consumed. Since this happens during system start-up,
the whole boot waits for it, which means an additional hour of downtime.
This was really unexpected for me, and unfortunately no one was really
interested in my report - I know people are busy. But if it hits other
users once their ZFS pools are already populated, they won't be happy.
For more details see my earlier post with the subject "zfs mount stuck
in zil_replay".
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com
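If someone wants to reproduce or narrow down the slow-mount problem Robert describes, mounting the file systems one at a time and watching the CPU-bound thread can help localise it (a sketch; "tank" is an illustrative pool name):

# mount each file system individually and time it
# (file systems that are already mounted will just report an error)
for fs in $(zfs list -H -o name -t filesystem -r tank); do
    ptime zfs mount "$fs"
done

# in another window, see which thread is spinning during the slow mounts
prstat -mL 5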
> The test case was build 38, Solaris 11, a 2 GB file, initially created
> with 1 MB SW, and a recsize of 8 KB, on a pool with two raid-z 5+1,
> accessed with 24 threads of 8 KB RW, for 500,000 ops or 40 seconds,
> whichever came first. The result at the pool level was 78% of the
> operations were RR, all overhead.

Hi David,

Could this bug (now fixed) have hit you?

6424554 full block re-writes need not read data in

-r
Neil Perrin
2006-Aug-14 22:59 UTC
[zfs-discuss] Re: 3510 HW RAID vs 3510 JBOD ZFS SOFTWARE RAID
Robert Milkowski wrote:
> ps. However, I'm really concerned about ZFS behavior when a pool is
> almost full, there are lots of write transactions to that pool, and the
> server is restarted forcibly or panics. I observed that file systems
> on that pool each take 10-30 minutes to mount during zfs mount -a, with
> one CPU completely consumed. Since this happens during system start-up,
> the whole boot waits for it, which means an additional hour of downtime.
> This was really unexpected for me, and unfortunately no one was really
> interested in my report - I know people are busy. But if it hits other
> users once their ZFS pools are already populated, they won't be happy.
> For more details see my earlier post with the subject "zfs mount stuck
> in zil_replay".

That problem must have fallen through the cracks. Yes we are busy, but
we really do care about your experiences and bugs.

I have just raised a bug to cover this issue:

6460107 Extremely slow mounts after panic - searching space maps during replay

Thanks for reporting this and helping make ZFS better.

Neil