Brad
2010-Jan-21 02:06 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Can anyone recommend an optimal, redundant striped configuration for an X4500? We'll be using it for an OLTP (Oracle) database and will need the best performance. Is it also true that reads will be load-balanced across the mirrors?
Is this considered a RAID 1+0 configuration?
zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0 mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0 mirror c0t2d0 c1t2d0 mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0 mirror c4t3d0 c5t3d0 mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0 mirror c0t5d0 c1t5d0 mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0 mirror c4t6d0 c5t6d0 mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0 mirror c6t7d0 c7t7d0 mirror c7t0d0 c7t4d0
Is it even possible to do a RAID 0+1?
zpool create -f testpool c0t0d0 c4t0d0 c0t1d0 c4t1d0 c6t1d0 c0t2d0 c4t2d0 c6t2d0 c0t3d0 c4t3d0 c6t3d0 c0t4d0 c4t4d0 c0t5d0 c4t5d0 c6t5d0 c0t6d0 c4t6d0 c6t6d0 c0t7d0 c4t7d0 c6t7d0 c7t0d0 mirror c1t0d0 c6t0d0 c1t1d0 c5t1d0 c7t1d0 c1t2d0 c5t2d0 c7t2d0 c1t3d0 c5t3d0 c7t3d0 c1t4d0 c6t4d0 c1t5d0 c5t5d0 c7t5d0 c1t6d0 c5t6d0 c7t6d0 c1t7d0 c5t7d0 c7t7d0 c7t4d0
-- This message posted from opensolaris.org
Bob Friesenhahn
2010-Jan-21 02:48 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Wed, 20 Jan 2010, Brad wrote:
> Can anyone recommend a optimum and redundant striped configuration
> for a X4500? We'll be using it for a OLTP (Oracle) database and
> will need best performance. Is it also true that the reads will be
> load-balanced across the mirrors?
>
> Is this considered a raid 1+0 configuration?
ZFS does not strictly support RAID 1+0. However, your sample command will create a pool based on mirror vdevs, which is written to in a load-shared fashion (not striped). This type of pool is ideal for databases since it consumes the fewest of those precious IOPS. With SATA drives, you need to preserve IOPS as much as possible.
ZFS does not stripe across vdevs. Its load-share approach writes on a (roughly) round-robin basis, but it prefers a less loaded vdev when under a heavy write load, and prefers to write to an empty vdev rather than to an almost full one. Because of this behavior, it is best to provision the full number of disks to start with so that the disks are evenly filled and the data is well distributed.
Reads from mirror pairs use a simple load-share algorithm to select the mirror side, and it does not attempt to strictly balance the reads. This does provide more performance than one disk, but not twice the performance.
> Is it even possible to do a raid 0+1?
No.
Bob
-- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
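As a rough illustration of the mirror-vdev layout discussed above, here is a minimal sketch that only prints a `zpool create` command, pairing the supplied disks into two-way mirrors in the order given. The helper name `mirror_pool_cmd` is hypothetical and not part of ZFS; the device names are taken from the original command. It assumes the caller has already ordered the disks so that each pair spans two controllers.

```shell
#!/bin/sh
# Print (do not run) a zpool create command that groups the given disks
# into two-way mirror vdevs, two disks at a time, in the order supplied.
# mirror_pool_cmd is a hypothetical helper for illustration only.
mirror_pool_cmd() {
    pool=$1
    shift
    # A two-way mirror needs disks in pairs.
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even number of disks" >&2
        return 1
    fi
    cmd="zpool create $pool"
    while [ $# -gt 0 ]; do
        cmd="$cmd mirror $1 $2"
        shift 2
    done
    echo "$cmd"
}

# Two mirror vdevs, each pair split across two controllers.
mirror_pool_cmd testpool c0t0d0 c1t0d0 c4t0d0 c6t0d0
# prints: zpool create testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0
```

Printing the command instead of executing it leaves room to review the pairing before anything touches the disks.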
John
2010-Jan-21 02:58 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Have you looked at using Oracle ASM instead of or with ZFS? Recent Sun docs concerning the F5100 seem to recommend a hybrid of both.
If you don't go that route, generally you should separate redo logs from actual data so they don't compete for I/O, since a lagging redo log switch hangs the database. If you use archive logs, put them on yet another pool.
Realistically, it takes lots of analysis with different configurations. Every workload and database is different.
A decent overview of configuring JBOD-type storage for databases is here, though it doesn't use ASM...
https://www.sun.com/offers/docs/j4000_oracle_db.pdf
It's a couple of years old, and that might contribute to the lack of an ASM mention.
Brad
2010-Jan-21 03:38 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
@hortnon - ASM is not within the scope of this project.
Brad
2010-Jan-21 03:52 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
"Zfs does not do striping across vdevs, but its load share approach will write based on (roughly) a round-robin basis, but will also prefer a less loaded vdev when under a heavy write load, or will prefer to write to an empty vdev rather than write to an almost full one." I''m trying to visualize this...can you elaborate or give a ascii example? So with the syntax below, load sharing is implemented? zpool create testpool disk1 disk2 disk3 -- This message posted from opensolaris.org
Brad
2010-Jan-21 04:14 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
I was reading your old posts about load-shares http://opensolaris.org/jive/thread.jspa?messageID=294580 .
So between raidz and load-share "striping", raidz stripes a file system block evenly across each vdev, but with load sharing the file system block is written on a vdev that's not filled up (slab??), then for the next file system block it continues filling up the 1MB slab until it's full before moving on to the next one?
Richard, can you comment? :)
Richard Elling
2010-Jan-21 04:50 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Jan 20, 2010, at 8:14 PM, Brad wrote:
> I was reading your old posts about load-shares http://opensolaris.org/jive/thread.jspa?messageID=294580 .
>
> So between raidz and load-share "striping", raidz stripes a file system block evenly across each vdev but with load sharing the file system block is written on a vdev that's not filled up (slab??) then for the next file system block it continues filling up the 1MB slab until its full being moving on to the next one?
>
> Richard can you comment? :)
That seems to be a reasonable interpretation. The nit is that the 1MB changeover is not the slab size. Slab sizes are usually much larger.
In my list of things to remember for Oracle and ZFS:
1. recordsize is the biggest tuning knob
2. put redo log on a low latency device, SSD if possible
3. avoid raidz, when possible
4. prefer to give memory to the SGA rather than the ARC
Roch provides some good guidelines when you have an SSD and a ZFS release which offers the logbias property here:
http://blogs.sun.com/roch/entry/synchronous_write_bias_property
-- richard
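To make the first two items of that list concrete, here is a hedged sketch that prints `zfs create` commands setting `recordsize` and `logbias`. The dataset names (`oradata`, `redo`), the 8k value (assumed to match the database's `db_block_size`), and the choice of `logbias=throughput` for datafiles are illustrative assumptions, not settings confirmed in this thread; `oracle_fs_cmds` is a hypothetical helper. The `logbias` property is only available on ZFS releases that offer it.

```shell
#!/bin/sh
# Print (do not run) zfs commands for a simple Oracle dataset layout.
# oracle_fs_cmds, the dataset names, and the values are assumptions
# made for illustration; adjust them to the actual database.
oracle_fs_cmds() {
    pool=$1
    db_block=$2   # assumed to equal Oracle's db_block_size, e.g. 8k
    # Datafiles: match recordsize to the database block size and bias
    # synchronous writes away from the separate log device.
    echo "zfs create -o recordsize=$db_block -o logbias=throughput $pool/oradata"
    # Redo logs: keep the default latency bias so a log device is used.
    echo "zfs create -o logbias=latency $pool/redo"
}

oracle_fs_cmds testpool 8k
```

Whether these values help depends on the workload, so they are a starting point for testing rather than a recommendation.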
Edward Ned Harvey
2010-Jan-21 09:29 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
> zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0
> mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0
> mirror c0t2d0 c1t2d0
> mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0
> mirror c4t3d0 c5t3d0
> mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0
> mirror c0t5d0 c1t5d0
> mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0
> mirror c4t6d0 c5t6d0
> mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0
> mirror c6t7d0 c7t7d0
> mirror c7t0d0 c7t4d0
This looks good. But you probably want to stick a "spare" in there, and add an SSD specified by "log".
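A small sketch of what adding a spare and a log device could look like. The quoted layout already uses all 46 listed disks in 23 mirrors, so the devices have to come from somewhere; this example assumes the last pair (c7t0d0, c7t4d0) is left out of the pool, with c7t0d0 kept as the hot spare and an SSD fitted in the c7t4d0 slot. That slot assignment and the helper name `spare_and_log_cmds` are hypothetical. The function only prints the commands.

```shell
#!/bin/sh
# Print (do not run) the commands that add a hot spare and a separate
# log device to an existing pool. spare_and_log_cmds is a hypothetical
# helper; the device names below are assumptions for illustration.
spare_and_log_cmds() {
    pool=$1
    spare=$2   # disk reserved as hot spare
    slog=$3    # low-latency device (e.g. SSD) for the intent log
    echo "zpool add $pool spare $spare"
    echo "zpool add $pool log $slog"
}

spare_and_log_cmds testpool c7t0d0 c7t4d0
```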
Edward Ned Harvey
2010-Jan-21 09:31 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
> Zfs does not strictly support RAID 1+0. However, your sample command
> will create a pool based on mirror vdevs which is written to in a
> load-shared fashion (not striped). This type of pool is ideal for
Although it's not technically striped according to the RAID definition of striping, it does achieve the same performance result (actually better), so people will generally refer to this as striping anyway.
Carsten Aulbert
2010-Jan-21 09:34 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Thursday 21 January 2010 10:29:16 Edward Ned Harvey wrote:
> > zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0
> > mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0
> > mirror c0t2d0 c1t2d0
> > mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0
> > mirror c4t3d0 c5t3d0
> > mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0
> > mirror c0t5d0 c1t5d0
> > mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0
> > mirror c4t6d0 c5t6d0
> > mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0
> > mirror c6t7d0 c7t7d0
> > mirror c7t0d0 c7t4d0
>
> This looks good. But you probably want to stick a "spare" in there, and
> add a SSD disk specified by "log"
May I jump in here and ask how people are using SSDs reliably in an x4500? So far we have had very little success with X25-E drives and a converter from 3.5 to 2.5 inches. Two systems have shown pretty bad instabilities with that setup. Has anyone had success here?
Cheers
Carste
Edward Ned Harvey
2010-Jan-21 09:38 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
> zpool create testpool disk1 disk2 disk3
In the traditional sense of RAID, this would create a concatenated data set. The size of the data set is the size of disk1 + disk2 + disk3. However, since this is ZFS, it's not constrained to linearly assigning virtual disk blocks to physical disk blocks ... ZFS will happily write a single large file to all 3 disks simultaneously and just keep track of where all the blocks landed. As a result, you get performance which is 3x a single disk for large files (like striping), but the performance for small files has not been harmed (as it is in striping)...
As an added bonus, unlike striping, you can still just add more disks to your zpool and expand your volume on the fly. The filesystem will dynamically adjust to accommodate more space and more devices, and will intelligently optimize for performance.
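For anyone trying to visualize the placement behavior described in this thread, here is a toy model only, not ZFS's actual allocator: each block goes to the vdev with the most free space, with ties going to the lowest-numbered vdev. With equally empty vdevs that degenerates into round-robin; with one much emptier vdev, that vdev receives the writes. The function name `place_blocks` and the block-count units are made up for the illustration.

```shell
#!/bin/sh
# Toy model only: place each block on the vdev with the most free space
# (ties go to the lowest vdev index). Real ZFS allocation is more
# involved; this only illustrates the bias toward emptier vdevs.
# It does not handle a pool that runs out of space.
place_blocks() {
    nblocks=$1
    shift
    free="$*"      # free space per vdev, counted in blocks
    placed=""
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        # Find the vdev with the most free space.
        best=0; bestfree=-1; idx=0
        for f in $free; do
            if [ "$f" -gt "$bestfree" ]; then
                best=$idx
                bestfree=$f
            fi
            idx=$((idx + 1))
        done
        # Charge one block to the chosen vdev.
        new=""; idx=0
        for f in $free; do
            if [ "$idx" -eq "$best" ]; then
                f=$((f - 1))
            fi
            new="$new $f"
            idx=$((idx + 1))
        done
        free=$new
        placed="$placed$best "
        i=$((i + 1))
    done
    echo "${placed% }"
}

place_blocks 6 4 4 4   # prints: 0 1 2 0 1 2
place_blocks 4 1 5 1   # prints: 1 1 1 1
```

The second call mirrors the case of a newly added, mostly empty vdev absorbing new writes, which is why provisioning all disks up front gives a more even distribution.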
Phil Harman
2010-Jan-21 10:37 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Can ASM match ZFS for checksum and self healing? The reason I ask is that the x45x0 uses inexpensive (less reliable) SATA drives. Even the J4xxx paper you cite uses SAS for production data (only using SATA for Oracle Flash, although I gave my concerns about that too).
The thing is, ZFS and the x45x0 seem made for each other. The latter only makes sense to me with all the goodness and assurance added by the former.
Phil
On 21 Jan 2010, at 02:58, John <hortnon at gmail.com> wrote:
> Have you looked at using Oracle ASM instead of or with ZFS? Recent
> Sun docs concerning the F5100 seem to recommend a hybrid of both.
>
> If you don't go that route, generally you should separate redo logs
> from actual data so they don't compete for I/O, since a redo switch
> lagging hangs the database. If you use archive logs, separate that
> on to yet another pool.
>
> Realistically, it takes lots of analysis with different
> configurations. Every workload and database is different.
>
> A decent overview of configuring JBOD-type storage for databases is
> here, though it doesn't use ASM...
> https://www.sun.com/offers/docs/j4000_oracle_db.pdf
> It's a couple years old and that might contribute to the lack of an
> ASM mention.
John
2010-Jan-21 12:57 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
No. But that's where the hybrid solution comes in. ASM would be used for the database files, and ZFS for the redo/archive logs and undo. Corrupt blocks in the datafiles would be repaired with data from redo during a recovery, and ZFS should give you assurance that the redo didn't get corrupted. Sun's docs on the F5100 point to this as the best solution for performance and recoverability/reliability.
Message was edited by: hortnon
Bob Friesenhahn
2010-Jan-21 15:50 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Thu, 21 Jan 2010, Edward Ned Harvey wrote:
>
> Although it's not technically striped according to the RAID definition of
> striping, it does achieve the same performance result (actually better) so
> people will generally refer to this as striping anyway.
People will say a lot of things, but that does not make them right. At some point, using the wrong terminology becomes foolish and counterproductive. Striping and load-share seem quite different to me. The difference is immediately apparent when watching the drive activity LEDs.
Bob
Brad
2010-Jan-22 06:04 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Did you buy the SSDs directly from Sun? I've heard there could possibly be firmware that's vendor specific for the X25-E.
Carsten Aulbert
2010-Jan-22 06:11 UTC
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Hi
On Friday 22 January 2010 07:04:06 Brad wrote:
> Did you buy the SSDs directly from Sun? I've heard there could possibly be
> firmware that's vendor specific for the X25-E.
No. So far I've heard that they are not readily available as certification procedures are still underway (apart from this the 8850 firmware should be ok, but that's just what I've heard).
C