I'm in the process of setting up a NAS for my company. It's going to be based on OpenSolaris and ZFS, running on a Dell R710 with two SAS 5/E HBAs. Each HBA will be connected to a 24-bay Supermicro JBOD chassis. Each chassis will have 12 drives to start out with, giving us room for expansion as needed.

Ideally, I'd like to have a mirror of a raidz2 setup, but from the documentation I've read, it looks like I can't do that, and that a stripe of mirrors is the only way to accomplish this.

I'm interested in hearing the opinions of others about the best way to set this up. Thanks!
On Aug 21, 2009, at 5:46 PM, Ron Mexico <no-reply at opensolaris.org> wrote:

> I'm in the process of setting up a NAS for my company. It's going to
> be based on OpenSolaris and ZFS, running on a Dell R710 with two
> SAS 5/E HBAs. Each HBA will be connected to a 24-bay Supermicro JBOD
> chassis. Each chassis will have 12 drives to start out with, giving
> us room for expansion as needed.
>
> Ideally, I'd like to have a mirror of a raidz2 setup, but from the
> documentation I've read, it looks like I can't do that, and that a
> stripe of mirrors is the only way to accomplish this.

Why? It uses as many drives as a RAID10, but you lose one more drive of usable space than RAID10 and you get less than half the performance.

You might be thinking of a RAID50, which would be multiple raidz vdevs in a zpool, or striped RAID5s.

If not, then stick with multiple mirror vdevs in a zpool (RAID10).

-Ross
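(For reference, the "RAID50" layout Ross describes is just multiple raidz vdevs handed to one zpool create -- a minimal sketch, with hypothetical c1/c2 device names standing in for disks on the two HBAs:

  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
                    raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0

ZFS stripes writes across all top-level vdevs automatically, so no separate "stripe" step is needed. A RAID10-style pool of mirror vdevs is sketched later in the thread.)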
On Fri, Aug 21, 2009 at 5:26 PM, Ross Walker <rswwalker at gmail.com> wrote:

> On Aug 21, 2009, at 5:46 PM, Ron Mexico <no-reply at opensolaris.org> wrote:
>
>> Ideally, I'd like to have a mirror of a raidz2 setup, but from the
>> documentation I've read, it looks like I can't do that, and that a
>> stripe of mirrors is the only way to accomplish this.
>
> Why?

Because some people are paranoid.

> It uses as many drives as a RAID10, but you lose one more drive of
> usable space than RAID10 and you get less than half the performance.

And far more protection.

> You might be thinking of a RAID50, which would be multiple raidz vdevs
> in a zpool, or striped RAID5s.
>
> If not, then stick with multiple mirror vdevs in a zpool (RAID10).
>
> -Ross

RAID10 won't provide as much protection. With raidz2+1 you can lose any 4 drives, and up to 14 if it's the right 14. With RAID10, if you lose the wrong two drives, you're done.

--Tim
As you can add multiple vdevs to a pool, my suggestion would be to do several smaller raidz1 or raidz2 vdevs in the pool.

With your setup (assuming 2 HBAs @ 24 drives each), your proposed layout would have yielded about 20 drives of usable storage (assuming raidz2 with 2 spares on each HBA, and then mirrored). Maximum number of drive failures survived (ideal scenario): 5 (assuming the spare hasn't caught up yet), 9 (assuming the spare had caught up and more drives failed).

Suggested setup (at least as far as I'm concerned -- and I am kinda new at ZFS, but not new to storage systems):

5 x raidz2 w/ 9 disks = 35 drives usable (9 disks ea x 5 raidz2 = 45 total drives - (5 raidz2 x 2 parity drives ea)). This leaves you with 3 drives that you can assign as spares (assuming 48 drives total). Maximum number of drive failures survived (ideal scenario): 11 (assuming the spare hasn't caught up yet), 14 (assuming the spare had caught up and more drives failed).

Keep in mind, the parity information will take up additional space as well, but it seems you were looking for maximum redundancy (and this setup would give you that).

Sorry, I just saw you were talking about 12 drives in each chassis. A similar thing applies: I would do one 9-drive raidz2 in each chassis, add 2 spares in total, and then add drives 9 at a time (and 1 more spare at some point).

Note: Keep in mind, I'm still kinda new to ZFS, so I may be completely wrong... (if I am, somebody, please correct me)

P-Chan
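(A sketch of P-Chan's 12-drives-per-chassis variant, with illustrative device names -- one 9-disk raidz2 per chassis plus pool-wide spares, growing nine drives at a time:

  zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0 \
      raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 c2t8d0 \
      spare c1t9d0 c2t9d0

  # later, expand with another 9-disk raidz2 vdev
  zpool add tank raidz2 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 \
                        c1t15d0 c1t16d0 c1t17d0 c1t18d0

Existing raidz vdevs cannot be widened, so growth happens a whole vdev at a time.)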
On Aug 21, 2009, at 6:34 PM, Tim Cook <tim at cook.ms> wrote:

> On Fri, Aug 21, 2009 at 5:26 PM, Ross Walker <rswwalker at gmail.com> wrote:
>
>> Why?
>
> Because some people are paranoid.

If that is the case, how about a separate zpool of large SATA disks, and either snapshot and send/recv to it, or use AVS to replicate to it.

>> It uses as many drives as a RAID10, but you lose one more drive of
>> usable space than RAID10 and you get less than half the performance.
>
> And far more protection.

It's not worth the cost; the complexity is so high that it itself will be a point of failure, and the performance is too low for it to be any use.

> RAID10 won't provide as much protection. With raidz2+1 you can lose any
> 4 drives, and up to 14 if it's the right 14. With RAID10, if you lose
> the wrong two drives, you're done.

Set up a side raidz2 zpool of SATA disks, snap the RAID10 and zfs send it to the other pool. In the event of catastrophe you can run off the raidz2 pool temporarily until the mirror pool is fixed (and it would still perform better than the mirrored raidz2 setup!).

-Ross
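(Ross's snapshot-and-replicate scheme is two commands per cycle -- a sketch, with hypothetical pool and dataset names; after the first full send, incremental sends move only the blocks changed since the previous snapshot:

  zfs snapshot tank/data@backup1
  zfs send tank/data@backup1 | zfs recv backup/data

  zfs snapshot tank/data@backup2
  zfs send -i backup1 tank/data@backup2 | zfs recv backup/data
)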
Ron Mexico wrote:

> I'm in the process of setting up a NAS for my company. [...]
>
> Ideally, I'd like to have a mirror of a raidz2 setup, but from the
> documentation I've read, it looks like I can't do that, and that a
> stripe of mirrors is the only way to accomplish this.
>
> I'm interested in hearing the opinions of others about the best way
> to set this up.

You'll have to add a bit of meat to "this"!

What are your resiliency, space and performance requirements?

-- Ian.
On Fri, Aug 21, 2009 at 5:52 PM, Ross Walker <rswwalker at gmail.com> wrote:

> If that is the case, how about a separate zpool of large SATA disks,
> and either snapshot and send/recv to it, or use AVS to replicate to it.

That adds a window of opportunity for failure. Potentially quite a large window.

> It's not worth the cost; the complexity is so high that it itself will
> be a point of failure, and the performance is too low for it to be any
> use.

The complexity? There should be no complexity involved in a mirrored raid-z/z2 pool.

> Set up a side raidz2 zpool of SATA disks, snap the RAID10 and zfs send
> it to the other pool. In the event of catastrophe you can run off the
> raidz2 pool temporarily until the mirror pool is fixed (and it would
> still perform better than the mirrored raidz2 setup!).

Snapshots are not a substitute for RAID. That's a completely different protection mechanism. If he wants another copy of the data, I'm sure he'll set up a second server and do zfs send/receives.
> You'll have to add a bit of meat to "this"!
>
> What are your resiliency, space and performance requirements?

Resiliency is most important, followed by space and then speed. Its primary function is to host digital assets for ad agencies and backups of other servers and workstations in the office.

Since I can't make a mirrored raidz2, I'd like the next best thing. If that means doing a zfs send from one raidz2 to the other, that's fine.
Ron Mexico wrote:

>> What are your resiliency, space and performance requirements?
>
> Resiliency is most important, followed by space and then speed. Its
> primary function is to host digital assets for ad agencies and backups
> of other servers and workstations in the office.
>
> Since I can't make a mirrored raidz2, I'd like the next best thing. If
> that means doing a zfs send from one raidz2 to the other, that's fine.

I normally use a stripe of mirrors for "live" data and a stripe of raidz2 (4+2) for "backup" data. I always assign a couple of hot spares to the pools. I also replicate important data between hosts or pools. The replication provides resiliency during a resilver.

-- Ian.
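(Ian's between-hosts replication is the same send/recv pattern piped over ssh -- host and dataset names here are hypothetical; -F rolls the target back to the last common snapshot before receiving:

  zfs snapshot tank/assets@replica1
  zfs send tank/assets@replica1 | ssh backuphost zfs recv -F tank2/assets
)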
On Aug 21, 2009, at 3:34 PM, Tim Cook wrote:

> Because some people are paranoid.

cue the Kinks' "Destroyer" :-)

>> It uses as many drives as a RAID10, but you lose one more drive of
>> usable space than RAID10 and you get less than half the performance.
>
> And far more protection.

Yes. With raidz3 even more :-)

I put together a spreadsheet a while back to help folks make this sort of decision.
http://blogs.sun.com/relling/entry/sample_raidoptimizer_output

I didn't include the outputs for RAID-5+1, but RAIDoptimizer can calculate it. It won't calculate raidz+1 because there is no such option. If there is some demand, I can put together a normal RAID (LVM or array) output of similar construction.

>> If not, then stick with multiple mirror vdevs in a zpool (RAID10).
>>
>> -Ross

My vote is with Ross. KISS wins :-)
Disclaimer: I'm also a member of BAARF.

> RAID10 won't provide as much protection. With raidz2+1 you can lose
> any 4 drives, and up to 14 if it's the right 14. With RAID10, if you
> lose the wrong two drives, you're done.

One of the reasons I wrote RAIDoptimizer is to help people get a handle on the math behind this. You can see some of that orientation in my other blogs on MTTDL. But at the end of the day, you can get a pretty good ballpark by saying every level of parity adds about 3 orders of magnitude to the MTTDL. No parity is always a loss. Single parity is better. Double parity even better. Eventually, common-cause problems dominate.

-- richard
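(To see where the "3 orders of magnitude" rule of thumb comes from, the textbook first-order approximations -- a simplification of the fuller models on Richard's blog -- for an N-disk group with per-disk mean time to failure MTTF and rebuild time MTTR are:

  single parity:  MTTDL ~ MTTF^2 / (N * (N-1) * MTTR)
  double parity:  MTTDL ~ MTTF^3 / (N * (N-1) * (N-2) * MTTR^2)

With illustrative numbers (MTTF = 1,000,000 hours, MTTR = 24 hours, N = 12), single parity gives ~3 x 10^8 hours and double parity ~1.3 x 10^12 hours: each added parity level multiplies MTTDL by roughly MTTF / (N * MTTR), a factor of a few thousand here.)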
On Fri, Aug 21, 2009 at 7:41 PM, Richard Elling <richard.elling at gmail.com> wrote:

> Yes. With raidz3 even more :-)
> I put together a spreadsheet a while back to help folks make this sort
> of decision.
> http://blogs.sun.com/relling/entry/sample_raidoptimizer_output

Good point as well. I completely spaced on the fact that raidz3 was added not so long ago. I don't think it's made it to any officially supported build yet though, has it?

> My vote is with Ross. KISS wins :-)
> Disclaimer: I'm also a member of BAARF.

My point is, RAIDZx+1 SHOULD be simple. I don't entirely understand why it hasn't been implemented. I can only imagine, like so many other things, it's because there hasn't been significant customer demand. Unfortunate, if it's as simple as I believe it is to implement. (No, don't ask me to do it, I put in my time programming in college and have no desire to do it again :))
On Aug 21, 2009, at 5:55 PM, Tim Cook wrote:

> My point is, RAIDZx+1 SHOULD be simple. I don't entirely understand
> why it hasn't been implemented. I can only imagine, like so many other
> things, it's because there hasn't been significant customer demand.

You can get in the same ballpark with at least two top-level raidz2 vdevs and copies=2. If you have three or more top-level raidz2 vdevs, then you can even do better with copies=3 ;-)

Note that I do not have a model for that, because it would require separate failure rate data for whole-disk failures and all other non-whole-disk failures. The latter is not available in data sheets. The closest I can get with published data is using the MTTDL[2] model, which considers the published unrecoverable read error rate. In other words, the model would be easy, but the data to feed the model is not available :-( Suffice to say, 2 top-level raidz2 vdevs of similar size with copies=2 should offer very nearly the same protection as raidz2+1.

-- richard
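(copies is a per-dataset property set after pool creation -- a sketch with hypothetical names; note it only affects blocks written after it is set, so set it before loading data:

  zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
      raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0
  zfs create tank/data
  zfs set copies=2 tank/data
)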
On 21-Aug-09, at 21:04, Richard Elling wrote:

> You can get in the same ballpark with at least two top-level raidz2
> vdevs and copies=2. If you have three or more top-level raidz2 vdevs,
> then you can even do better with copies=3 ;-)

Maybe this is noted somewhere, but I did not realize that "copies" invoked logic that distributed the copies among vdevs? Can you please provide some pointers about this?

Thanks,

A.

--
Adam Sherman
CTO, Versature Corp.
Tel: +1.877.498.3772 x113
On Fri, Aug 21, 2009 at 8:04 PM, Richard Elling <richard.elling at gmail.com> wrote:

> Note that I do not have a model for that, because it would require
> separate failure rate data for whole-disk failures and all other
> non-whole-disk failures. [...] Suffice to say, 2 top-level raidz2
> vdevs of similar size with copies=2 should offer very nearly the same
> protection as raidz2+1.

You sure about that? Say I have a SAS controller shit the bed (pardon the French), and take one of the JBODs out entirely. Even with copies=2, isn't the entire pool going tits-up and offline when it loses an entire vdev?

It would seem to me copies=2 is only applicable when you have both an entire disk loss, and corrupt data on the "good disks". But feel free to enlighten :) That scenario seems far less likely than having a controller go bad, but that's with my anecdotal personal experiences.

--Tim
On Aug 21, 2009, at 6:09 PM, Adam Sherman wrote:

> Maybe this is noted somewhere, but I did not realize that "copies"
> invoked logic that distributed the copies among vdevs? Can you please
> provide some pointers about this?

It is hard to describe in words, so I made some pictures :-)
http://blogs.sun.com/relling/entry/zfs_copies_and_data_protection

-- richard
comment far below...

On Aug 21, 2009, at 6:17 PM, Tim Cook wrote:

> You sure about that? Say I have a SAS controller shit the bed (pardon
> the French), and take one of the JBODs out entirely. Even with
> copies=2, isn't the entire pool going tits-up and offline when it
> loses an entire vdev?

Yes. But you need to understand that the probability of a SAS controller failing is much, much smaller than that of a disk. So in order to properly model the system, you can't treat them as having the same failure rate (the difference is an order of magnitude for HDDs). Depending on the repair policy, the probability of losing a SAS controller is expected to be less than the probability of losing 3 disks in a raidz2. Since SAS is relatively easy to make redundant, a really paranoid person would have two SAS controllers, and the probability of losing two highly reliable SAS controllers at the same time is way small :-)

> It would seem to me copies=2 is only applicable when you have both an
> entire disk loss, and corrupt data on the "good disks". But feel free
> to enlighten :) That scenario seems far less likely than having a
> controller go bad, but that's with my anecdotal personal experiences.

As the Kinks sing, "paranoia will destroy ya!" :-)

-- richard
On Fri, 21 Aug 2009, Tim Cook wrote:

> RAID10 won't provide as much protection. With raidz2+1 you can lose
> any 4 drives, and up to 14 if it's the right 14. With RAID10, if you
> lose the wrong two drives, you're done.

On the flip side, the chance of losing a second drive during the recovery interval is much less with mirroring, since only one drive needs to be read in order to support the resilver, and there is far less mechanical action and I/O involved. If you make sure that you have a spare drive available to the pool, then the spare drive can be resilvered and take over while you sleep, minimizing the risk.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
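(Adding a hot spare to an existing pool is a one-liner -- the device name is hypothetical; zpool status then shows whether a spare has been pulled into service:

  zpool add tank spare c3t0d0
  zpool status tank
)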
On Fri, 21 Aug 2009, Ron Mexico wrote:

> Since I can't make a mirrored raidz2, I'd like the next best thing.
> If that means doing a zfs send from one raidz2 to the other, that's
> fine.

Without using hierarchical servers (e.g. volumes from a ZFS pool exported via iSCSI to be part of another ZFS storage pool) you can't do mirrored raidz2, but you can easily do triple mirroring. If disk space is not a concern, then it is difficult to beat the reliability of a triple mirror.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
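(A triple mirror can be built outright, or an existing two-way mirror can be grown by attaching a third side -- device names hypothetical:

  # three-way mirror vdev from the start
  zpool create tank mirror c1t0d0 c2t0d0 c3t0d0

  # or attach a third disk alongside an existing mirror member
  zpool attach tank c1t0d0 c3t0d0
)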
On Fri, 21 Aug 2009, Richard Elling wrote:

> magnitude for HDDs). Depending on the repair policy, the probability
> of losing a SAS controller is expected to be less than the
> probability of losing 3 disks in a raidz2. Since SAS is relatively
> easy to make redundant, a really paranoid person would have two SAS
> controllers, and the probability of losing two highly reliable SAS
> controllers at the same time is way small :-)

This is a reason to prefer mirroring, with the devices in each mirror carefully split across controllers. This approach makes failures easier to understand and helps avoid propagation of errors. Complex system designs lead to complex problems. Some of the world's largest and most successful 5-9s class systems are built using simple duplex redundancy. It is possible to build raidz and raidz2 systems so that their devices are accessed via unique paths, but such systems rapidly become quite large and expensive.

> As the Kinks sing, "paranoia will destroy ya!" :-)

There's a time device inside of me, I'm a self-destructin' disk!

When anything goes wrong in a system, the human factor becomes quite large. It dramatically increases the probability that human error (the primary cause of data loss) will occur. The system should be designed to accommodate the attendant humans. Solaris is still much too complicated for people to understand in times of crisis.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
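(Bob's controller-split layout pairs each disk on one HBA with a counterpart on the other, so losing a whole controller or JBOD still leaves every mirror vdev with one healthy side -- a sketch, with c1/c2 standing in for the two SAS 5/E paths:

  zpool create tank mirror c1t0d0 c2t0d0 \
                    mirror c1t1d0 c2t1d0 \
                    mirror c1t2d0 c2t2d0
)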
On Fri, 21 Aug 2009 18:04:49 -0700, Richard Elling <richard.elling at gmail.com> wrote:

> You can get in the same ballpark with at least two top-level
> raidz2 vdevs and copies=2. If you have three or more
> top-level raidz2 vdevs, then you can even do better
> with copies=3 ;-)

Please note that copies=3 will be obsoleted soon, because the space for the pointer to the third instance of the data block was needed for some other purpose (I forgot which).
--
( Kees Nuyt )
c[_]
On Aug 22, 2009, at 1:02 PM, Kees Nuyt wrote:

> Please note that copies=3 will be obsoleted soon, because
> the space for the pointer to the third instance of the data
> block was needed for some other purpose (I forgot which).

The limit will be copies=2 for ZFS encrypted datasets. By default, file systems will not be encrypted.

-- richard
> Suffice to say, 2 top-level raidz2 vdevs of similar size with copies=2
> should offer very nearly the same protection as raidz2+1.
> -- richard

This looks like the way to go. Thanks for your input. It's much appreciated!