Hello,

I'm using a Sun Fire X4500 "Thumper" and trying to get some sense of the best performance I can get from it with zfs.

I'm running without mirroring or raid, and have checksumming turned off. I built the zfs with these commands:

# zpool create mypool disk0 disk1 ... diskN
# zfs set checksum=off mypool
# zfs create mypool/testing

When I run an application with 8 threads performing writes, I see this performance:

 1 disk  --  42 MB/s
 2 disks --  81 MB/s
 4 disks -- 147 MB/s
 8 disks -- 261 MB/s
12 disks -- 347 MB/s
16 disks -- 433 MB/s
32 disks -- 687 MB/s
45 disks -- 621 MB/s

I'm surprised it doesn't scale better than this, and I'm curious to know what the best configuration is for getting the maximum write performance from the Thumper.

Thanks,

-- Bill.

This message posted from opensolaris.org
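Since the benchmark application itself is not shown, here is a rough way to approximate an 8-thread sequential write load with stock tools, as a point of reference only. The mount point /mypool/testing (the ZFS default for the dataset above), the file names, and the count= value are assumptions, not the actual application:

# eight concurrent 128k-block sequential writers, roughly 10 GB each
for i in 1 2 3 4 5 6 7 8; do
    dd if=/dev/zero of=/mypool/testing/writer$i bs=128k count=80000 &
done
wait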
> From: zfs-discuss-bounces at opensolaris.org
> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of
> William Loewe
>
> I'm using a Sun Fire X4500 "Thumper" and trying to get some
> sense of the best performance I can get from it with zfs.
>
> I'm running without mirroring or raid, and have checksumming
> turned off. I built the zfs with these commands:
>
> # zpool create mypool disk0 disk1 ... diskN
> # zfs set checksum=off mypool
> # zfs create mypool/testing
>
> When I run an application with 8 threads performing writes, I
> see this performance:
>
>  1 disk  --  42 MB/s
>  2 disks --  81 MB/s
>  4 disks -- 147 MB/s
>  8 disks -- 261 MB/s
> 12 disks -- 347 MB/s
> 16 disks -- 433 MB/s
> 32 disks -- 687 MB/s
> 45 disks -- 621 MB/s

This is more a matter of the number of vdevs at the top level of the pool, coupled with the fact that there are six (6) controllers across which the disks are attached. The good news is that adding redundancy does not slow it down.

The following two example configurations demonstrate the two ends of the range of performance you can expect from the Thumper.

For example, a pool constructed of three (3) raidz2 vdevs, each with twelve (12) disks, like so:

bash-3.00# zpool status conf6z2pool
  pool: conf6z2pool
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        conf6z2pool  ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c5t7d0   ONLINE       0     0     0
            c6t7d0   ONLINE       0     0     0
            c7t7d0   ONLINE       0     0     0
            c8t7d0   ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c5t6d0   ONLINE       0     0     0
            c6t6d0   ONLINE       0     0     0
            c7t6d0   ONLINE       0     0     0
            c8t6d0   ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c5t5d0   ONLINE       0     0     0
            c6t5d0   ONLINE       0     0     0
            c7t5d0   ONLINE       0     0     0
            c8t5d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c5t3d0   ONLINE       0     0     0
            c6t3d0   ONLINE       0     0     0
            c7t3d0   ONLINE       0     0     0
            c8t3d0   ONLINE       0     0     0
          raidz2     ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c5t2d0   ONLINE       0     0     0
            c6t2d0   ONLINE       0     0     0
            c7t2d0   ONLINE       0     0     0
            c8t2d0   ONLINE       0     0     0
            c0t1d0   ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c5t1d0   ONLINE       0     0     0
            c6t1d0   ONLINE       0     0     0
            c7t1d0   ONLINE       0     0     0
            c8t1d0   ONLINE       0     0     0
        spares
          c8t0d0     AVAIL

yields the following performance for several sustained 128k write streams (9 x dd if=/dev/zero bs=128k):

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  9.56G  16.3T      0  3.91K      0   492M
conf6z2pool  9.56G  16.3T      0  4.04K      0   509M
conf6z2pool  9.56G  16.3T      0  4.05K      0   510M
conf6z2pool  9.56G  16.3T      0  4.11K      0   517M

Sustained read performance for several 128k streams (9 x dd of=/dev/null bs=128k) yields:

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  1.30T  15.0T  5.97K      0   759M      0
conf6z2pool  1.30T  15.0T  5.97K      0   759M      0
conf6z2pool  1.30T  15.0T  5.96K      0   756M      0

And sustained read/write performance for several concurrent 128k streams (9 x dd if=/dev/zero bs=128k && 9 x dd of=/dev/null bs=128k) yields:

bash-3.00# zpool iostat conf6z2pool 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
conf6z2pool  1.30T  15.0T  3.34K  2.54K   424M   320M
conf6z2pool  1.30T  15.0T  2.89K  2.83K   367M   356M
conf6z2pool  1.30T  15.0T  2.96K  2.80K   375M   353M
conf6z2pool  1.30T  15.0T  3.50K  2.58K   445M   325M
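For reference, a layout like conf6z2pool above could be built with a single command along these lines. This is a sketch reconstructed from the status output; the exact command and disk ordering used for the test are not shown in the thread:

bash-3.00# zpool create conf6z2pool \
    raidz2 c0t7d0 c1t7d0 c5t7d0 c6t7d0 c7t7d0 c8t7d0 \
           c0t6d0 c1t6d0 c5t6d0 c6t6d0 c7t6d0 c8t6d0 \
    raidz2 c0t5d0 c1t5d0 c5t5d0 c6t5d0 c7t5d0 c8t5d0 \
           c0t3d0 c1t3d0 c5t3d0 c6t3d0 c7t3d0 c8t3d0 \
    raidz2 c0t2d0 c1t2d0 c5t2d0 c6t2d0 c7t2d0 c8t2d0 \
           c0t1d0 c1t1d0 c5t1d0 c6t1d0 c7t1d0 c8t1d0 \
    spare  c8t0d0

Note that each raidz2 group spans all six controllers (c0, c1, c5, c6, c7, c8) with two disks apiece, so losing a controller costs at most two disks per group, which double parity can survive.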
The complete opposite end of the performance spectrum comes from a pool of mirrored vdevs with the following configuration:

bash-3.00# zpool status conf7mpool
  pool: conf7mpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        conf7mpool  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t7d0  ONLINE       0     0     0
            c1t7d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t7d0  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
            c8t7d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t6d0  ONLINE       0     0     0
            c1t6d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t6d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c8t6d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t5d0  ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
        spares
          c8t0d0    AVAIL
          c7t0d0    AVAIL

9 x dd if=/dev/zero bs=128k (write) yields:

bash-3.00# zpool iostat conf7mpool 10
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
conf7mpool   566G  9.42T      0  5.78K      0   724M
conf7mpool   572G  9.41T      0  5.43K      0   679M
conf7mpool   578G  9.40T      0  5.72K      0   717M
conf7mpool   583G  9.40T      0  5.81K      0   727M

9 x dd of=/dev/null bs=128k (read) yields:

bash-3.00# zpool iostat conf7mpool 10
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
conf7mpool  2.68T  7.29T  12.9K      0  1.59G      0
conf7mpool  2.68T  7.29T  12.9K      0  1.59G      0
conf7mpool  2.68T  7.29T  12.9K      0  1.60G      0
conf7mpool  2.68T  7.29T  12.9K      0  1.60G      0

9 x dd if=/dev/zero bs=128k (write) && 9 x dd of=/dev/null bs=128k (read) yields:

bash-3.00# zpool iostat conf7mpool 10
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
conf7mpool  1.62T  8.35T  4.81K  4.79K   610M   600M
conf7mpool  1.63T  8.34T  2.77K  5.12K   351M   642M
conf7mpool  1.63T  8.34T  7.91K  2.57K  1004M   318M
conf7mpool  1.64T  8.33T  4.52K  3.78K   574M   471M

 -- paul
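For reference, the mirrored pool above could likewise be built in one command. This is a sketch reconstructed from the status output (the command actually used is not shown in the thread):

bash-3.00# zpool create conf7mpool \
    mirror c0t7d0 c1t7d0  mirror c5t7d0 c6t7d0  mirror c7t7d0 c8t7d0 \
    mirror c0t6d0 c1t6d0  mirror c5t6d0 c6t6d0  mirror c7t6d0 c8t6d0 \
    mirror c0t5d0 c1t5d0  mirror c5t5d0 c6t5d0  mirror c7t5d0 c8t5d0 \
    mirror c0t3d0 c1t3d0  mirror c5t3d0 c6t3d0  mirror c7t3d0 c8t3d0 \
    mirror c0t2d0 c1t2d0  mirror c5t2d0 c6t2d0  mirror c7t2d0 c8t2d0 \
    mirror c0t1d0 c1t1d0  mirror c5t1d0 c6t1d0  mirror c7t1d0 c8t1d0 \
    mirror c0t4d0 c1t4d0  mirror c5t4d0 c7t4d0  mirror c8t4d0 c0t0d0 \
    mirror c1t0d0 c5t0d0 \
    spare  c8t0d0 c7t0d0

Every mirror pair keeps its two halves on different controllers, and with 22 top-level mirror vdevs reads can be serviced from either side of each mirror, which is consistent with the roughly 2:1 read-to-write bandwidth shown in the results above.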
William Loewe wrote:
> Hello,
>
> I'm using a Sun Fire X4500 "Thumper" and trying to get some sense of the best performance I can get from it with zfs.
>
> I'm running without mirroring or raid, and have checksumming turned off. I built the zfs with these commands:
>
> # zpool create mypool disk0 disk1 ... diskN
> # zfs set checksum=off mypool
> # zfs create mypool/testing
>
> When I run an application with 8 threads performing writes, I see this performance:
>
>  1 disk  --  42 MB/s
>  2 disks --  81 MB/s
>  4 disks -- 147 MB/s
>  8 disks -- 261 MB/s
> 12 disks -- 347 MB/s
> 16 disks -- 433 MB/s
> 32 disks -- 687 MB/s
> 45 disks -- 621 MB/s
>
> I'm surprised it doesn't scale better than this, and I'm curious to know what the best configuration is for getting the maximum write performance from the Thumper.

When doing testing like this, you need to make sure you are generating enough large I/O to be interesting. Unlike many RAID-0 implementations, ZFS allocates its 128 kByte blocks across vdevs on a slab basis. The default slab size is likely to be 1 MByte, so if you want to see I/O spread across 45 disks, you'd need to generate more than 45 MBytes of concurrent write I/O. Otherwise, you will only see a subset of the disks active. This should be measurable via iostat with a small period. Once written, random reads should exhibit more stochastic balancing of the iops across the disks.

The disks used in the X4500 have a media bandwidth of 31-64.8 MBytes/s, so getting 42 MBytes/s to or from a single disk is not unreasonable.
 -- richard
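A quick way to watch this while the benchmark runs, using standard Solaris tools (the pool name here follows the example at the top of the thread):

# per-disk view at 1-second intervals; -z suppresses idle disks, so with
# too little concurrent I/O only a handful of drives will appear each second
iostat -xnz 1

# the same picture broken down per top-level vdev
zpool iostat -v mypool 1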