Hi all,

I'm new here and to ZFS, but I've been lurking for quite some time. My question is simple: which is better, 8+2 or 8+1+spare? Both follow the (N+P), N={2,4,8}, P={1,2} rule, but 8+2 results in a total of 10 disks, which is one disk more than the 3<=num-disks<=9 rule allows. But 8+2 has a much better MTTDL than 8+1+spare, so I'm trying to understand how bad it would really be: what doesn't work or scale?

Thanks,
Kent
On Mon, Jul 09, 2007 at 11:14:58AM -0400, Kent Watsen wrote:
>
> Hi all,
>
> I'm new here and to ZFS but I've been lurking for quite some time... My
> question is simple: which is better 8+2 or 8+1+spare? Both follow the
> (N+P) N={2,4,8} P={1,2} rule, but 8+2 results in a total of 10 disks,
> which is one disk more than 3<=num-disks<=9 rule. But 8+2 has much
> better MTTDL than 8+1+spare and so I'm trying to understand how bad it
> would really be - what doesn't work/scale?

I think that the 3<=num-disks<=9 rule only applies to RAIDZ and that it was changed to 4<=num-disks<=10 for RAIDZ2, but I might be remembering wrong.

--
"Perl can be fast and elegant as much as J2EE can be fast and elegant. In the hands of a skilled artisan, it can and does happen; it's just that most of the shit out there is built by people who'd be better suited to making sure that my burger is cooked thoroughly." -- Jonathan Patschke
> which is better 8+2 or 8+1+spare?

8+2 is safer for the same speed.
8+2 requires a little more math, so it's slower in theory (unlikely to be seen in practice).
(4+1)*2 is 2x faster, and in theory is less likely to have wasted space in a transaction group (unlikely to be seen in practice).
(4+1)*2 is cheaper to upgrade in place because of its fewer elements.

So, Mr (no scale on the time access) Elling: what's the MTTDL for each of these three?
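The three layouts being compared all use ten disks, which is easy to lose track of in the shorthand. A minimal sketch of the bookkeeping follows; the `Layout` class and its field names are made up for illustration and are not part of any ZFS tool. It counts only disks and guaranteed failure tolerance, and says nothing about measured performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of a pool layout (illustration only)."""
    name: str
    vdevs: int            # number of top-level raidz vdevs in the pool
    data_per_vdev: int    # data disks in each vdev (the N in N+P)
    parity_per_vdev: int  # parity disks in each vdev (the P in N+P)
    spares: int = 0       # hot spares, idle until a disk fails

    @property
    def total_disks(self) -> int:
        return self.vdevs * (self.data_per_vdev + self.parity_per_vdev) + self.spares

    @property
    def data_disks(self) -> int:
        return self.vdevs * self.data_per_vdev

    @property
    def any_failures_survived(self) -> int:
        # Number of simultaneous failures survivable no matter which disks fail.
        # A spare does not raise this; it only shortens the repair window.
        return self.parity_per_vdev


layouts = [
    Layout("8+2", vdevs=1, data_per_vdev=8, parity_per_vdev=2),
    Layout("8+1+spare", vdevs=1, data_per_vdev=8, parity_per_vdev=1, spares=1),
    Layout("(4+1)*2", vdevs=2, data_per_vdev=4, parity_per_vdev=1),
]

for layout in layouts:
    print(
        f"{layout.name:>10}: {layout.total_disks} disks, "
        f"{layout.data_disks} data, {layout.vdevs} vdev(s), "
        f"survives any {layout.any_failures_survived} failure(s)"
    )
```

The vdev count is the figure behind the "2x faster" remark: only (4+1)*2 gives the pool two top-level vdevs to spread I/O across.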
Rob Logan wrote:
> > which is better 8+2 or 8+1+spare?
>
> 8+2 is safer for the same speed
> 8+2 requires a little more math, so its slower in theory. (unlikely seen)
> (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
> in transaction group (unlikely seen)
> (4+1)*2 is cheaper to upgrade in place because of its fewer elements
>
> so, Mr (no scale on the time access) Elling: so what's the MTTDL
> between these three?

All things equal, an 8+2 raidz2 has an MTTDL about 4-5 orders of magnitude greater than an 8+1+spare raidz1. Bigger MTTDL is better.

The reason to recommend spares is to reduce the time during which the system is vulnerable to a second failure. This time can be quite large, especially when you defer maintenance. Looking at this another way, in the 8+2 raidz2 case the second parity is already sync'ed, so you won't have the vulnerable resync time during which you could lose data due to a second failure.

Another reason to recommend spares is when you have multiple top-level vdevs and want to amortize the spare cost over multiple sets. For example, if you have 19 disks, then 2x 8+1 raidz + spare amortizes the cost of the spare across two raidz sets.

-- richard
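To see why both the second parity disk and the repair window matter, here is a minimal sketch using the common textbook MTTDL approximations for single- and double-parity sets. This is an assumption on my part, not necessarily the model behind the 4-5 orders of magnitude figure above: it ignores unrecoverable read errors during reconstruction, so it need not reproduce that number. The MTBF and repair times are made-up inputs.

```python
import math


def mttdl_single_parity(n_disks: int, mtbf_h: float, mttr_h: float) -> float:
    """Approximate MTTDL in hours for one single-parity set.

    Data is lost when a second disk fails during the repair window.
    """
    return mtbf_h ** 2 / (n_disks * (n_disks - 1) * mttr_h)


def mttdl_double_parity(n_disks: int, mtbf_h: float, mttr_h: float) -> float:
    """Approximate MTTDL in hours for one double-parity set.

    Data is lost when a third disk fails while two are still being repaired.
    """
    return mtbf_h ** 3 / (n_disks * (n_disks - 1) * (n_disks - 2) * mttr_h ** 2)


# All three values below are assumptions chosen for illustration.
MTBF_H = 1_000_000.0              # per-disk mean time between failures
REPAIR_H = 24.0                   # repair window when resilvering starts at once
DEFERRED_REPAIR_H = 24.0 + 7 * 24  # same resilver after a week of waiting

raidz1_with_spare = mttdl_single_parity(9, MTBF_H, REPAIR_H)
raidz1_deferred = mttdl_single_parity(9, MTBF_H, DEFERRED_REPAIR_H)
raidz2 = mttdl_double_parity(10, MTBF_H, REPAIR_H)

HOURS_PER_YEAR = 8760
for label, hours in [
    ("8+1, hot spare", raidz1_with_spare),
    ("8+1, deferred repair", raidz1_deferred),
    ("8+2", raidz2),
]:
    print(f"{label:>22}: {hours / HOURS_PER_YEAR:.3g} years")

ratio = raidz2 / raidz1_with_spare
print(f"8+2 vs 8+1+spare: about 10^{math.log10(ratio):.1f} under these assumptions")
```

With the same repair window on both sides, the ratio for these disk counts reduces algebraically to MTBF / (10 x repair time), so the gap grows as repairs get faster or disks get more reliable. In the single-parity formula MTTDL is inversely proportional to the repair window, which is the argument for a hot spare.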
> I think that the 3<=num-disks<=9 rule only applies to RAIDZ and it was
> changed to 4<=num-disks<=10 for RAIDZ2, but I might be remembering wrong.

Can anybody confirm that the 3<=num-disks<=9 rule only applies to RAIDZ and that 4<=num-disks<=10 applies to RAIDZ2?

Thanks,
Kent
> (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
> in transaction group (unlikely seen)
> (4+1)*2 is cheaper to upgrade in place because of its fewer elements

I'm aware of these benefits, but I feel that having one large LUN is easier to manage, in that I can allocate the entire array's storage arbitrarily. I fear that if I split the array in half, I might end up with not enough space on one side and too much on the other. Otherwise, I'd do this in a heartbeat.

Any advice?

Thanks,
Kent
On Mon, Jul 09, 2007 at 03:57:30PM -0400, Kent Watsen wrote:
>
> > (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
> > in transaction group (unlikely seen)
> > (4+1)*2 is cheaper to upgrade in place because of its fewer elements
> I'm aware of these benefits but I feel that having one large lun is
> easier to manage - in that I can allocate the entire array's storage
> arbitrarily... I fear that if I split the array in half, I might end up
> with not enough space on one side and too much on the other.
> Otherwise, I'd do this in a heartbeat...

Don't confuse vdevs with pools. If you add two 4+1 vdevs to a single pool, it still appears to be "one place to put things". ;)

-brian
Kent Watsen wrote:
>> (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
>> in transaction group (unlikely seen)
>> (4+1)*2 is cheaper to upgrade in place because of its fewer elements
>>
> I'm aware of these benefits but I feel that having one large lun is
> easier to manage - in that I can allocate the entire array's storage
> arbitrarily... I fear that if I split the array in half, I might end up
> with not enough space on one side and too much on the other.
> Otherwise, I'd do this in a heartbeat...
>
> Any advice?
>

If you're using both halves for the same ZFS pool, then it doesn't matter; the allocation to the various filesystems takes place at the pool level.

And if you're considering using the entire set of drives, allocated as one hunk, for ZFS, then you might just as well divide it in half as above, if you like those two benefits more than the benefits of doing it the other way (which I agree you probably should).

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/dd-b
Pics: http://dd-b.net/dd-b/SnapshotAlbum, http://dd-b.net/photography/gallery
Dragaera: http://dragaera.info
Rob Logan wrote:
>
> > which is better 8+2 or 8+1+spare?
>
> 8+2 is safer for the same speed
> 8+2 requires a little more math, so its slower in theory. (unlikely seen)
> (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
> in transaction group (unlikely seen)

I keep reading that (4+1)*2 is 2x faster, but if all the data I care about is in one of the two sets, does it follow that my access to just that data is also 2x faster? Or is it more that simultaneous read/write of the entire array is (globally) 2x faster?

Thanks,
Kent
> Don't confuse vdevs with pools. If you add two 4+1 vdevs to a single pool it
> still appears to be "one place to put things". ;)
>

Newbie oversight, thanks!

Kent
> Another reason to recommend spares is when you have multiple top-level
> vdevs
> and want to amortize the spare cost over multiple sets. For example, if
> you have 19 disks then 2x 8+1 raidz + spare amortizes the cost of the
> spare
> across two raidz sets.
> -- richard

Interesting, I hadn't realized that a spare could be used across sets.

Thanks!
Kent
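The 19-disk example is simple arithmetic, sketched below. The helper name `disks_needed` is made up for illustration; the 8+1 geometry and the single shared spare come from the message quoted above.

```python
def disks_needed(raidz_sets: int, data: int = 8, parity: int = 1, spares: int = 1) -> int:
    """Total disks for a pool of identical raidz sets plus pool-wide hot spares."""
    return raidz_sets * (data + parity) + spares


# One spare shared by both 8+1 sets, as in the 19-disk example.
shared_spare = disks_needed(raidz_sets=2, spares=1)

# Alternative for comparison: one dedicated spare per set.
spare_per_set = disks_needed(raidz_sets=2, spares=2)

print(f"shared spare:  {shared_spare} disks")
print(f"spare per set: {spare_per_set} disks")
```

Sharing the spare saves one disk per additional raidz set, at the cost of covering only one failure at a time across the whole pool until the failed disk is replaced.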
Kent Watsen wrote:
> Rob Logan wrote:
>
>>> which is better 8+2 or 8+1+spare?
>>>
>> 8+2 is safer for the same speed
>> 8+2 requires a little more math, so its slower in theory. (unlikely seen)
>> (4+1)*2 is 2x faster, and in theory is less likely to have wasted space
>> in transaction group (unlikely seen)
>>
>
> I keep reading that (4+1)*2 is 2x faster, but if all the data I care
> about is in one of the two sets, does it follow that my access to just
> that data is also 2x faster? - or is it more that simultaneous
> read/write of the entire array is (globally) 2x faster?
>

It is unlikely that the data you care about will be in just one of the two sets, given how ZFS spreads data around.

-- David Dyer-Bennet
Your data gets striped across the two sets, so what you get is a stripe of raidz sets, and that is what gives you the 2x speedup.

tank
---raidz
------devices
---raidz
------devices

Sorry for the diagram.

So you end up with your zpool "tank" as a stripe across two raidz sets.
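The striping described above can be pictured with a toy model. This is a deliberately simplified sketch: it places blocks strictly round-robin, whereas the real ZFS allocator is more involved (for instance, it takes free space into account), and the vdev labels `raidz-0` and `raidz-1` are placeholders for the two raidz sets in the diagram.

```python
from collections import Counter
from itertools import cycle


def stripe_blocks(num_blocks: int, vdev_names: list[str]) -> Counter:
    """Place blocks on top-level vdevs in strict round-robin order.

    Simplification for illustration only; not the actual ZFS allocator.
    """
    placement = Counter()
    next_vdev = cycle(vdev_names)
    for _ in range(num_blocks):
        placement[next(next_vdev)] += 1
    return placement


# Writing 1000 blocks of a single file into a pool with two raidz vdevs.
placement = stripe_blocks(1000, ["raidz-0", "raidz-1"])
for vdev, blocks in placement.items():
    print(f"{vdev}: {blocks} blocks")
```

Because even one file's blocks land on both vdevs, reading that file can keep both sets busy at once, which is why the speedup is not limited to workloads that touch the whole pool.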
John-Paul Drawneek wrote:
> Your data gets striped across the two sets so what you get is a raidz stripe giving you the 2x faster.
>
> tank
> ---raidz
> ------devices
> ---raidz
> ------devices
>
> sorry for the diagram.
>
> So you got your zpool tank with raidz stripe.

Thanks, I think you all have hammered this point home for me now. All this confusion stems from my not realizing that sets are merged into a single striped pool... ugh! ;)

Kent