The only supported controller I've found is the Areca ARC-1280ML. I want to put it in one of the 24-disk Supermicro chassis that Silicon Mechanics builds. Has anyone had success with this card and this kind of chassis/number of drives?

cheers,
Blake
On Mon, 14 Apr 2008, Blake Irvin wrote:
> The only supported controller I've found is the Areca ARC-1280ML.
> I want to put it in one of the 24-disk Supermicro chassis that
> Silicon Mechanics builds.

For obvious reasons (redundancy and throughput), it makes more sense to purchase two 12-port cards. I see that there is an option to populate more cache RAM.

I would be interested to know what actual throughput that one card is capable of. The CDW site says "300MB/s".

Bob
=====================================
Bob Friesenhahn, bfriesen at simple.dallas.tx.us
http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Tue, Apr 15, 2008 at 1:25 AM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> For obvious reasons (redundancy and throughput), it makes more sense
> to purchase two 12-port cards. I see that there is an option to
> populate more cache RAM.
More RAM always helps ;)

> I would be interested to know what actual throughput that one card is
> capable of. The CDW site says "300MB/s".
It looks like it's more like 600 MB/s. See this thread on Hardforums for more details:
http://hardforum.com/showpost.php?p=1032222973&postcount=111
He's got a 1280ML and 24 disks, with 22 in RAID 6 and 2 hot spares, getting up to 588 MB/s sustained block read according to HDTach.

Two other people on that forum (handles Ockie and odditory) have those cards and are similarly impressed, but I don't see benchmarks from them. In any case, I agree with your suggestion: get two 12-port cards, or three 8-port cards, and save money and get better performance anyways.

Will
On Mon, Apr 14, 2008 at 11:34 PM, Will Murnane <will.murnane at gmail.com> wrote:
> On Tue, Apr 15, 2008 at 1:25 AM, Bob Friesenhahn
> <bfriesen at simple.dallas.tx.us> wrote:
> > For obvious reasons (redundancy and throughput), it makes more sense
> > to purchase two 12-port cards. I see that there is an option to
> > populate more cache RAM.
> More RAM always helps ;)
>
> > I would be interested to know what actual throughput that one card is
> > capable of. The CDW site says "300MB/s".
> It looks like it's more like 600 MB/s. See this thread on Hardforums
> for more details:
> http://hardforum.com/showpost.php?p=1032222973&postcount=111
> He's got a 1280ML and 24 disks, with 22 in RAID 6 and 2 hot spares,
> getting up to 588 MB/s sustained block read according to HDTach.
>
> Two other people on that forum (handles Ockie and odditory) have those
> cards and are similarly impressed, but I don't see benchmarks from
> them. In any case, I agree with your suggestion: get two 12-port
> cards, or three 8-port cards, and save money and get better
> performance anyways.
>
> Will

I'm sure you're already aware, but if not, 22 drives in a raid-6 is absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what you want even with raid-6. The odds of you losing data in a 22-disk raid-6 are far too great to be worth it if you care about your data. /rant
On Mon, Apr 14, 2008 at 9:41 PM, Tim <tim at tcsac.net> wrote:
> I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what
> you want even with raid-6. The odds of you losing data in a 22-disk raid-6
> are far too great to be worth it if you care about your data. /rant

Funny, I was thinking the same thing!

I think NetApp says to use 14-disk stripes with their double parity, arguing that double parity across 14 disks is better protection than two single-parity stripes of 7.

The other thought that I had was whether ZFS would have worked for him, but it sounds like he's a Windows guy. ... and to threadjack, has there been any talk of a Windows ZFS driver?

-B

--
Brandon High bhigh at freaks.com
"The good is the enemy of the best." - Nietzsche
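As a purely illustrative aside (not from the thread): the NetApp argument above can be sanity-checked with a toy binomial model. The per-disk failure probability used below is a made-up placeholder and failures are assumed independent, so treat this as a sketch of the reasoning, not a real reliability figure.

from math import comb

def p_group_loss(n_disks, parity, p_fail):
    # Probability that more than `parity` disks out of `n_disks` fail
    # within the same vulnerability window (binomial, independent failures).
    return sum(comb(n_disks, k) * p_fail**k * (1 - p_fail)**(n_disks - k)
               for k in range(parity + 1, n_disks + 1))

p = 0.01  # hypothetical per-disk failure probability during the window

one_dp_group = p_group_loss(14, 2, p)                 # one 14-disk double-parity group
two_sp_groups = 1 - (1 - p_group_loss(7, 1, p)) ** 2  # two independent 7-disk single-parity groups

print("14-disk double parity   : %.2e" % one_dp_group)   # ~3e-4
print("two 7-disk single parity: %.2e" % two_sp_groups)  # ~4e-3

With these hypothetical inputs, the single double-parity group comes out roughly an order of magnitude safer than the pair of single-parity groups, which is the shape of the argument NetApp makes.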
I have 16 disks in RAID 5 and I'm not worried.

> I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what
> you want even with raid-6. The odds of you losing data in a 22-disk raid-6
> are far too great to be worth it if you care about your data. /rant

--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
On Tue, Apr 15, 2008 at 10:09 AM, Maurice Volaski <mvolaski at aecom.yu.edu> wrote:
> I have 16 disks in RAID 5 and I'm not worried.
>
> > I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> > absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what
> > you want even with raid-6. The odds of you losing data in a 22-disk raid-6
> > are far too great to be worth it if you care about your data. /rant

You could also be driving your car down the freeway at 100mph drunk, high, and without a seatbelt on and not be worried. The odds will still be horribly against you.
On Apr 15, 2008, at 10:58 AM, Tim wrote:
> On Tue, Apr 15, 2008 at 10:09 AM, Maurice Volaski <mvolaski at aecom.yu.edu> wrote:
> > I have 16 disks in RAID 5 and I'm not worried.
> >
> > > I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> > > absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what
> > > you want even with raid-6. The odds of you losing data in a 22-disk raid-6
> > > are far too great to be worth it if you care about your data. /rant
>
> You could also be driving your car down the freeway at 100mph drunk,
> high, and without a seatbelt on and not be worried. The odds will
> still be horribly against you.

Perhaps providing the computations rather than the conclusions would be more persuasive on a technical list ;>

--
Keith H. Bierman khbkhb at gmail.com | AIM kbiermank
5430 Nassau Circle East | Cherry Hills Village, CO 80113 | 303-997-2749
<speaking for myself*> Copyright 2008
Right, a nice depiction of the failure modes involved and their probabilities based on typical published MTBF of components and other arguments/caveats, please? Does anyone have the cycles to actually illustrate this, or have URLs to such studies?

On Tue, Apr 15, 2008 at 1:03 PM, Keith Bierman <khbkhb at gmail.com> wrote:
> On Apr 15, 2008, at 10:58 AM, Tim wrote:
> > On Tue, Apr 15, 2008 at 10:09 AM, Maurice Volaski <mvolaski at aecom.yu.edu> wrote:
> > > I have 16 disks in RAID 5 and I'm not worried.
> > >
> > > > I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> > > > absolutely SUICIDE when using SATA disks. 12 disks is the upper end of what
> > > > you want even with raid-6. The odds of you losing data in a 22-disk raid-6
> > > > are far too great to be worth it if you care about your data. /rant
> >
> > You could also be driving your car down the freeway at 100mph drunk,
> > high, and without a seatbelt on and not be worried. The odds will
> > still be horribly against you.
>
> Perhaps providing the computations rather than the conclusions would be
> more persuasive on a technical list ;>
On Tue, 15 Apr 2008, Keith Bierman wrote:
> Perhaps providing the computations rather than the conclusions would be
> more persuasive on a technical list ;>

No doubt. The computations depend considerably on the size of the disk drives involved. The odds of experiencing media failure on a single 1TB SATA disk are quite high. Consider that this media failure may occur while attempting to recover from a failed disk. There have been some good articles on this in USENIX Login magazine.

ZFS raidz1 and raidz2 are NOT directly equivalent to RAID5 and RAID6, so the failure statistics would be different. Regardless, single disk failure in a raidz1 substantially increases the risk that something won't be recoverable if there is a media failure while rebuilding. Since ZFS duplicates its own metadata blocks, it is most likely that some user data would be lost but the pool would otherwise recover. If a second disk drive completely fails, then you are toast with raidz1.

RAID5 and RAID6 rebuild the entire disk while raidz1 and raidz2 only rebuild existing data blocks, so raidz1 and raidz2 are less likely to experience media failure if the pool is not full.

Bob
=====================================
Bob Friesenhahn, bfriesen at simple.dallas.tx.us
http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Tue, Apr 15, 2008 at 12:03 PM, Keith Bierman <khbkhb at gmail.com> wrote:
> On Apr 15, 2008, at 10:58 AM, Tim wrote:
> > You could also be driving your car down the freeway at 100mph drunk,
> > high, and without a seatbelt on and not be worried. The odds will
> > still be horribly against you.
>
> Perhaps providing the computations rather than the conclusions would be
> more persuasive on a technical list ;>

What fun is that? ;)

http://blogs.netapp.com/dave/2006/03/expect_double_d.html

There's a layman's explanation.
> Perhaps providing the computations rather than the conclusions would
> be more persuasive on a technical list ;>

2 16-disk SATA arrays in RAID 5
2 16-disk SATA arrays in RAID 6
1 9-disk SATA array in RAID 5

4 drive failures over 5 years. Of course, YMMV, especially if you drive drunk :-)

--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
On Tue, 15 Apr 2008, Maurice Volaski wrote:
> 4 drive failures over 5 years. Of course, YMMV, especially if you
> drive drunk :-)

Note that there is a difference between drive failure and media data loss. In a system which has been running fine for a while, the chance of a second drive failing during rebuild may be low, but the chance of block-level media failure is not.

However, computers do not normally run in a vacuum. Many failures are caused by something like a power glitch, temperature cycle, or the flap of a butterfly's wings. Unless your environment is completely stable and the devices are not dependent on some of the same things (e.g. power supplies, chassis, SATA controller, air conditioning), then what caused one device to fail may very well cause another device to fail.

Bob
=====================================
Bob Friesenhahn, bfriesen at simple.dallas.tx.us
http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Maurice Volaski wrote:
>> Perhaps providing the computations rather than the conclusions would
>> be more persuasive on a technical list ;>
>
> 2 16-disk SATA arrays in RAID 5
> 2 16-disk SATA arrays in RAID 6
> 1 9-disk SATA array in RAID 5
>
> 4 drive failures over 5 years. Of course, YMMV, especially if you
> drive drunk :-)

My mileage does vary!

On a 4-year-old 84-disk array (with 12 RAID 5s), I replace one drive every couple of weeks (on average). This array lives in a proper machine room with good power and cooling. The array stays active, though.

-Luke
On Apr 15, 2008, at 11:18 AM, Bob Friesenhahn wrote:
> On Tue, 15 Apr 2008, Keith Bierman wrote:
>> Perhaps providing the computations rather than the conclusions
>> would be more persuasive on a technical list ;>
>
> No doubt. The computations depend considerably on the size of the
> disk drives involved. The odds of experiencing media failure on a
> single 1TB SATA disk are quite high. Consider that this media
> failure may occur while attempting to recover from a failed disk.
> There have been some good articles on this in USENIX Login magazine.
>
> ZFS raidz1 and raidz2 are NOT directly equivalent to RAID5 and
> RAID6, so the failure statistics would be different. Regardless,
> single disk failure in a raidz1 substantially increases the risk
> that something won't be recoverable if there is a media failure
> while rebuilding. Since ZFS duplicates its own metadata blocks, it
> is most likely that some user data would be lost but the pool would
> otherwise recover. If a second disk drive completely fails, then
> you are toast with raidz1.
>
> RAID5 and RAID6 rebuild the entire disk while raidz1 and raidz2
> only rebuild existing data blocks, so raidz1 and raidz2 are less
> likely to experience media failure if the pool is not full.

Indeed; but worked illustrative examples are apt to be more helpful than blanket pronouncements ;>

--
Keith H. Bierman khbkhb at gmail.com | AIM kbiermank
5430 Nassau Circle East | Cherry Hills Village, CO 80113 | 303-997-2749
<speaking for myself*> Copyright 2008
Luke Scharf wrote:
> Maurice Volaski wrote:
>>> Perhaps providing the computations rather than the conclusions would
>>> be more persuasive on a technical list ;>
>>
>> 2 16-disk SATA arrays in RAID 5
>> 2 16-disk SATA arrays in RAID 6
>> 1 9-disk SATA array in RAID 5
>>
>> 4 drive failures over 5 years. Of course, YMMV, especially if you
>> drive drunk :-)
>
> My mileage does vary!
>
> On a 4-year-old 84-disk array (with 12 RAID 5s), I replace one drive
> every couple of weeks (on average). This array lives in a proper
> machine room with good power and cooling. The array stays active, though.
>
> -Luke

I basically agree with this. We have about 150TB in mostly RAID 5 configurations, ranging from 8 to 16 disks per volume. We also replace bad drives about every week or three, but in six years, have never lost an array.

I think our "secret" is this: on our 3ware controllers we run a verify a minimum of three times a week. The verify will read the whole array (data and parity), find bad blocks, and move them if necessary to good media. Because of this, we've never had a rebuild trigger a secondary failure. Knock wood.

Our server room has conditioned power and cooling as well.

Jon
Tim schrieb:
> I'm sure you're already aware, but if not, 22 drives in a raid-6 is
> absolutely SUICIDE when using SATA disks. 12 disks is the upper end of
> what you want even with raid-6. The odds of you losing data in a 22-disk
> raid-6 are far too great to be worth it if you care about your data. /rant

Let's do some calculations. Based on the specs of
http://www.seagate.com/docs/pdf/datasheet/disc/ds_barracuda_es_2.pdf

AFR = 0.73%
BER = 1:10^15

22-disk RAID-6 with 1TB disks:

- The probability of a disk failure is 16.06% p.a. (0.73% * 22)
- Let's assume one day of array rebuild time (22 * 1TB / 300MB/s)
- This means the probability of another disk failing during the rebuild onto the hot spare is 0.042% (0.73% * (1/365) * 21)
- If a second disk fails, there is a 16% chance of an unrecoverable read error on the remaining 20 disks (8 * 20 * 10^12 / 10^15)

So the probability of a data loss is:

16.06% * 0.042% * 16.0% = 0.001% p.a.

(a little bit higher, since I haven't calculated 3 or more failing disks).

The calculations assume an independent failure probability for each disk and correct numbers for AFR and BER. In reality I have found the AFR rates of the disk vendors way too optimistic, but the BER rate too pessimistic. If we calculate with

AFR = 3%
BER = same

we end up with a data loss probability of 0.018% p.a.

Daniel
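For anyone who wants to reproduce Daniel's arithmetic, here is a small sketch (not from the original post) that follows the same simplified model: independent failures, a one-day rebuild window, and the AFR/BER figures quoted above.

def raid6_loss_probability(n_disks=22, disk_tb=1.0, afr=0.0073,
                           rebuild_days=1.0, ber=1e-15):
    # Same chain of events as in the post above: one disk fails, a second
    # disk fails during the rebuild window, and an unrecoverable read
    # error then hits one of the remaining disks.
    p_first = afr * n_disks                                # ~16.06% p.a.
    p_second = afr * (rebuild_days / 365) * (n_disks - 1)  # ~0.042%
    bits_read = 8 * (n_disks - 2) * disk_tb * 1e12         # bits on the surviving disks
    p_ure = bits_read * ber                                # ~16%
    return p_first * p_second * p_ure

print("AFR 0.73%%: %.4f%% p.a." % (raid6_loss_probability() * 100))          # ~0.001%
print("AFR 3.00%%: %.4f%% p.a." % (raid6_loss_probability(afr=0.03) * 100))  # ~0.018%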
Truly :) I was planning something like 3 pools concatenated. But we are only populating 12 bays at the moment.

Blake
Bob Friesenhahn wrote:
> On Tue, 15 Apr 2008, Maurice Volaski wrote:
>> 4 drive failures over 5 years. Of course, YMMV, especially if you
>> drive drunk :-)
>
> Note that there is a difference between drive failure and media data
> loss. In a system which has been running fine for a while, the chance
> of a second drive failing during rebuild may be low, but the chance of
> block-level media failure is not.

I couldn't have said it better myself :-). The prevailing studies are clearly showing unrecoverable reads as the most common failure mode.

> However, computers do not normally run in a vacuum. Many failures are
> caused by something like a power glitch, temperature cycle, or the
> flap of a butterfly's wings. Unless your environment is completely
> stable and the devices are not dependent on some of the same things
> (e.g. power supplies, chassis, SATA controller, air conditioning), then
> what caused one device to fail may very well cause another device to fail.

Add to this manufacturing vintage. We do see some vintages which have higher incidence rates than others. It is not often practical to get all the disks in a system to be from different vintages, especially on a system like the X4500.
 -- richard
Jacob Ritorto wrote:
> Right, a nice depiction of the failure modes involved and their
> probabilities based on typical published MTBF of components and other
> arguments/caveats, please? Does anyone have the cycles to actually
> illustrate this, or have URLs to such studies?

Yes, this is what I do for a living. There are several blog postings on this very topic at http://blog.sun.com/relling, usually under the ZFS tag. However, what is missing is a RAID-6 vs RAID-Z2 comparison for 3-N disks. I have the data and models, just need to blog it... stay tuned.
 -- richard
On Apr 15, 2008, at 13:18, Bob Friesenhahn wrote:
> ZFS raidz1 and raidz2 are NOT directly equivalent to RAID5 and RAID6,
> so the failure statistics would be different. Regardless, single disk
> failure in a raidz1 substantially increases the risk that something
> won't be recoverable if there is a media failure while rebuilding.
> Since ZFS duplicates its own metadata blocks, it is most likely that
> some user data would be lost but the pool would otherwise recover. If
> a second disk drive completely fails, then you are toast with raidz1.
>
> RAID5 and RAID6 rebuild the entire disk while raidz1 and raidz2 only
> rebuild existing data blocks, so raidz1 and raidz2 are less likely to
> experience media failure if the pool is not full.

While the failure statistics may be different, I think any comparison would be "apples-to-apples". RAID-5/6 would be the "worst case", with ZFS having slightly better numbers; it has a slight advantage because of checksumming (to help reduce the chance of undetected / unrecoverable read errors) and because you're only recreating used blocks instead of all blocks (less stress on the disks; shorter rebuild time). There's also the consistency advantage (no fsck time; no write hole).

In general you could probably calculate the numbers for RAID-5/6, and be fairly comfortable thinking that RAID-Z/2 is better than whatever you get. (For some quantity of "better".)
On Wed, 16 Apr 2008, David Magda wrote:
>> RAID5 and RAID6 rebuild the entire disk while raidz1 and raidz2 only
>> rebuild existing data blocks, so raidz1 and raidz2 are less likely to
>> experience media failure if the pool is not full.
>
> While the failure statistics may be different, I think any comparison
> would be "apples-to-apples".

Note that if the pool is only 10% full, then it is 10X less likely to experience a media failure during rebuild than traditional RAID-5/6 with the same disks. In addition to this, zfs replicates metadata and writes the copies to different disks depending on the redundancy strategy. A traditional filesystem on traditional RAID does not have this same option (having no knowledge of the underlying disks), even though it does replicate some essential metadata (multiple super blocks).

Since my time on this list, the vast majority of reports have been of the nature "my pool did not come back up after system crash" or "the pool stopped responding", and not that their properly redundant pool lost some user data. This indicates that the storage principles are quite sound but the implementation (being relatively new) still has a few rough edges.

Bob
=====================================
Bob Friesenhahn, bfriesen at simple.dallas.tx.us
http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
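A quick numerical illustration of Bob's point (not from the thread; the disk count and BER are assumed to match the 22 x 1TB example earlier): with a fixed bit error rate, the chance of hitting at least one unrecoverable read error during a rebuild scales with the amount of data actually read, so resilvering a pool that is only 10% full exposes roughly a tenth of the risk of a full-disk RAID-5/6 rebuild.

def p_ure_during_rebuild(tb_read, ber=1e-15):
    # Probability of at least one unrecoverable read error while reading
    # `tb_read` terabytes, assuming independent bit errors at rate `ber`.
    bits = tb_read * 1e12 * 8
    return 1 - (1 - ber) ** bits

full_rebuild = 21 * 1.0                # read every block on 21 surviving 1TB disks
sparse_resilver = 0.10 * full_rebuild  # raidz resilver of a pool that is only 10% full

print("full-disk rebuild : %.1f%%" % (p_ure_during_rebuild(full_rebuild) * 100))     # ~15.5%
print("10%%-full resilver: %.1f%%" % (p_ure_during_rebuild(sparse_resilver) * 100))  # ~1.7%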
Hello Blake,

Did you end up purchasing this? We're considering buying a SilMech K501 as our new fileserver, with a pair of Areca controllers in JBOD mode. Any experience would be appreciated.

Thanks,
Christophe Dupre

Blake Irvin wrote:
> The only supported controller I've found is the Areca ARC-1280ML. I want to
> put it in one of the 24-disk Supermicro chassis that Silicon Mechanics builds.
>
> Has anyone had success with this card and this kind of chassis/number of drives?
>
> cheers,
> Blake
We are currently using the 2-port Areca card SilMech offers for boot, and 2 of the Supermicro/Marvell cards for our array. Silicon Mechanics gave us great support and burn-in testing for Solaris 10. Talk to a sales rep there and I don't think you will be disappointed.

cheers,
Blake
I bought similar kit from them, but when I received the machine, uninstalled, I looked at the install manual for the Areca card and found that it's a manual driver add that is documented to _occasionally hang_, and you have to _kill it off manually_ if it does. I'm really not having that in a production server, so as soon as I saw this, I asked them (SiMech) for an alternative, production-worthy solution, but they've not yet responded at all (after three emailings). The pre-sale communication, however, was superb. Perhaps this sort of driver munging behaviour is perfectly acceptable in the ms/linux world and they therefore think I'm being too fussy? Hmm.

Anyway, to get this Areca card as far away from me as possible, I plan to either go with a small zfs CompactFlash array on the mobo ide channels (or) just use one channel of each of my two supermicro/marvell boards. Any opinions out there on that plan, btw? This is slated to be a first-tier production file/iscsi server (!)..

Sorry if this is drifting too far off topic, but I really would have appreciated this sort of info when searching for a zfs hw solution smaller than thumper. That said, why, oh why, does Sun not offer a Niagara board on a 16-disk chassis for $10000 for our little corner of the market? Dealing with the concept of using PCs in production has been absolutely horrifying thus far.

thx
jake

On Fri, Jun 27, 2008 at 3:59 PM, Blake Irvin <blake.irvin at gmail.com> wrote:
> We are currently using the 2-port Areca card SilMech offers for boot, and 2
> of the Supermicro/Marvell cards for our array. Silicon Mechanics gave us
> great support and burn-in testing for Solaris 10. Talk to a sales rep there
> and I don't think you will be disappointed.
>
> cheers,
> Blake
Hmm. That's kind of sad. I grabbed the latest Areca drivers and haven't had a speck of trouble. Was the driver revision specified in the docs you read the latest one?

Flash boot does seem nice in a way, since Solaris writes to the boot volume so seldom on a machine that has enough RAM to avoid swapping.

I think Sun needs to offer something in the 5k range as well :)

Blake