After having read this mailing list for a little while, I get the impression that there are at least some people who regularly experience on-disk corruption that ZFS should be able to report and handle. I've been running a raidz1 on three 1TB consumer disks for approx. 2 years now (about 90% full), and I scrub the pool every 3-4 weeks and have never had a single error. From the oft-quoted 10^14 error rate that consumer disks are rated at, I should have seen an error by now -- the scrubbing process is not the only activity on the disks, after all, and the data transfer volume from that alone clocks in at almost exactly 10^14 by now.

Not that I'm worried, of course, but it comes as a slight surprise to me. Or does the 10^14 rating just reflect the strength of the on-disk ECC algorithm?
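A minimal back-of-the-envelope sketch of where that figure comes from; the per-disk read volume per scrub and the scrub count are rounded assumptions, not measured values:

    # Rough estimate of the scrub read volume per disk over two years.
    # Assumptions (not reported figures): ~0.9 TB of allocated data read
    # from each disk per scrub, one scrub every 3.5 weeks.
    TB = 1e12                              # decimal terabytes, as drive vendors count
    bytes_per_scrub_per_disk = 0.9 * TB
    scrubs = 2 * 52 / 3.5                  # about 30 scrubs in two years
    bits_read = bytes_per_scrub_per_disk * scrubs * 8
    print(f"bits read per disk from scrubbing alone: {bits_read:.1e}")
    # -> about 2e14 bits, i.e. on the order of the 1-per-10^14 consumer URE spec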
2012-01-24 19:50, Stefan Ring wrote:
> After having read this mailing list for a little while, I get the
> impression that there are at least some people who regularly
> experience on-disk corruption that ZFS should be able to report and
> handle. I've been running a raidz1 on three 1TB consumer disks for
> approx. 2 years now (about 90% full), and I scrub the pool every 3-4
> weeks and have never had a single error. From the oft-quoted 10^14
> error rate that consumer disks are rated at, I should have seen an
> error by now -- the scrubbing process is not the only activity on the
> disks, after all, and the data transfer volume from that alone clocks
> in at almost exactly 10^14 by now.
>
> Not that I'm worried, of course, but it comes as a slight surprise to
> me. Or does the 10^14 rating just reflect the strength of the on-disk
> ECC algorithm?

I maintained several dozen storage servers for about 12 years, and I've seen quite a few drive deaths as well as automatically triggered RAID array rebuilds. But usually these were "infant deaths" in the first year, and the drives that passed the age test often gave no noticeable problems for the next decade. Several 2-4 disk systems have been working as OpenSolaris SXCE servers with ZFS pools for root and data for years now, and also show no problems. However, most of these are branded systems and disks from Sun. I think we've only had one or two drives die, but we happened to have cold-spares due to over-ordering ;)

I do have a suspiciously high error rate on my home NAS, which was thrown together from whatever pieces I had at home at the time I left for an overseas trip. The box has been nearly unmaintained since then and can suffer from physical problems known and unknown, such as the SATA cabling (varied and quite possibly bad), non-ECC memory, dust and overheating, etc. It is also possible that aging components such as the CPU and motherboard, which have about 5 years of active lifetime behind them (including an overclocked past), contribute to the error rates.

The old 80GB root drive has had some bad sectors (READ errors during scrubs and data access), and rpool has been recreated with copies=2 a few times now, thanks to a LiveUSB, but the main data pool had no substantial errors until the CKSUM errors reported this winter (metadata:0x0 and then the dozen or so in-file checksum mismatches). Since one of the drives got itself lost soon after, and only reappeared after all the cables were replugged, I still tend to blame this on SATA cabling as the most probable root cause. I do not have an up-to-date SMART error report, and the box is not accessible at the moment, so I can't comment on lower-level errors in the main pool drives. They were new at the time I put the box together (almost a year ago now).

However, so far I am bothered much more by the tendency of this box to lock up and/or reboot after somewhat repeatable actions (such as destroying large snapshots of deduped datasets) than by the discovered on-disk CKSUM errors, however they appeared. I tend to write this off as shortcomings of the OS (i.e. memory hunger and lockup in scanrate hell as the most frequent cause), and this really bothers me more now - causing lots of downtime until some friend comes to that apartment to reboot the box.

> Or does the 10^14 rating just reflect the strength
> of the on-disk ECC algorithm?

I am not sure how much the algorithms differ between "enterprise" and "consumer" disks, while the UBER is said to differ by about a factor of 100.
It might also have to do with the quality of materials (better steel in ball bearings, etc.) as well as better firmware/processors which optimize mechanical workloads and reduce mechanical wear. Maybe so, at least...

Finally, this is statistics. It does not "guarantee" that for some 90 Tbits of transferred data you will certainly see an error (and just one, for that matter). The drives that died young hopefully also count in the overall stats, moving the bar a bit higher for their better-made brethren.

Also, disk UBER covers media failures and the ability of the disk's cache, firmware and ECC to deal with them. After the disk sends the "correct" sector on the wire, many things can still happen: noise in bad connectors, electromagnetic interference from all the motors in your computer onto the data cable, the ability (or lack thereof) of the data protocol (IDE, ATA, SCSI) to detect and/or recover from such random bits flipped between disk and HBA, errors in HBA chips and code, noise in old rusty PCI* connector slots, bitflips in non-ECC RAM or overheated CPUs, power surges from the PSU... There is a lot of stuff that can break :)

//Jim Klimov
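To put rough numbers on that statistical point: a minimal sketch, assuming unrecoverable errors arrive independently at the rated per-bit rate (real failures are often correlated, so this is only illustrative) and taking the per-disk read volume from the earlier estimate.

    import math

    uber = 1e-14            # consumer-class spec: one error per 1e14 bits read
    bits_read = 2e14        # rough per-disk read volume from the earlier estimate
    expected_errors = uber * bits_read
    p_no_errors = math.exp(-expected_errors)   # Poisson probability of zero errors
    print(f"expected errors: {expected_errors:.1f}")
    print(f"chance of a completely clean run anyway: {p_no_errors:.0%}")
    # -> about 2 expected errors, yet still roughly a 14% chance of seeing none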
On Tue, 24 Jan 2012, Jim Klimov wrote:

>> Or does the 10^14 rating just reflect the strength
>> of the on-disk ECC algorithm?
>
> I am not sure how much the algorithms differ between "enterprise" and
> "consumer" disks, while the UBER is said to differ by about a factor
> of 100. It might also have to do with the quality of materials (better
> steel in ball bearings, etc.) as well as better firmware/processors
> which optimize mechanical workloads and reduce mechanical wear.
> Maybe so, at least...

In addition to the above, an important factor is that enterprise disks with 10^16 ratings also offer considerably less storage density. Instead of 3TB storage per drive, you get 400GB storage per drive. So-called "nearline" enterprise storage drives fit in somewhere in the middle, with higher storage densities, but also higher error rates.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
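A hedged illustration of that capacity-versus-UBER trade-off, using the nominal figures from this thread plus an assumed 10^-15 rating for the nearline class (not vendor measurements): expected unrecoverable errors for a single full read of each drive.

    def expected_ures(capacity_bytes, uber_per_bit):
        """Expected unrecoverable read errors when reading the whole drive once."""
        return capacity_bytes * 8 * uber_per_bit

    print("3 TB consumer     (1e-14):", expected_ures(3e12, 1e-14))    # ~0.24
    print("3 TB nearline     (1e-15):", expected_ures(3e12, 1e-15))    # ~0.024
    print("400 GB enterprise (1e-16):", expected_ures(400e9, 1e-16))   # ~0.0003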
What I've noticed is that when I have my drives in a situation of small airflow, and hence hotter operating temperatures, my disks will drop quite quickly. I've now moved my systems into large cases with large amounts of airflow, using the IcyDock brand of removable drive enclosures:

http://www.newegg.com/Product/Product.aspx?Item=N82E16817994097
http://www.newegg.com/Product/Product.aspx?Item=N82E16817994113

I use the SASUC8I SATA/SAS controller to access 8 drives:

http://www.newegg.com/Product/Product.aspx?Item=N82E16816117157

I put it in PCI-e x16 slots on "graphics heavy" motherboards which might have as many as 4x PCI-e x16 slots. I am replacing an old motherboard with this one:

http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=1124780

The case that I found to be a good match for my needs is the Raven:

http://www.newegg.com/Product/Product.aspx?Item=N82E16811163180

It has enough slots (7) to put 2x 3-in-2 and 1x 4-in-3 IcyDock bays in to provide 10 drives in hot-swap bays. I really think that the big issue is that you must move the air. The drives really need to stay cool or else you will see degraded performance and/or data loss much more often.

Gregg Wonderly

On 1/24/2012 9:50 AM, Stefan Ring wrote:
> After having read this mailing list for a little while, I get the
> impression that there are at least some people who regularly
> experience on-disk corruption that ZFS should be able to report and
> handle. I've been running a raidz1 on three 1TB consumer disks for
> approx. 2 years now (about 90% full), and I scrub the pool every 3-4
> weeks and have never had a single error. From the oft-quoted 10^14
> error rate that consumer disks are rated at, I should have seen an
> error by now -- the scrubbing process is not the only activity on the
> disks, after all, and the data transfer volume from that alone clocks
> in at almost exactly 10^14 by now.
>
> Not that I'm worried, of course, but it comes as a slight surprise to
> me. Or does the 10^14 rating just reflect the strength of the on-disk
> ECC algorithm?
On 01/24/12 17:06, Gregg Wonderly wrote:
> What I've noticed is that when I have my drives in a situation of small
> airflow, and hence hotter operating temperatures, my disks will drop
> quite quickly.

While I *believe* the same thing, and thus have over-provisioned airflow in my cases (for both drives and memory), there are studies which failed to find a strong correlation between drive temperature and failure rates:

http://research.google.com/archive/disk_failures.pdf
http://www.usenix.org/events/fast07/tech/schroeder.html
From: Anonymous Remailer (austria)
Date: 2012-Jan-25 09:08 UTC
Subject: [zfs-discuss] What is your data error rate?
I've been watching the heat control issue carefully since I had to take a job offshore (cough reverse H1B cough) in a place without adequate AC, and I was able to get them to ship my servers and some other gear. Then I read that Intel is guaranteeing their servers will work at up to 100 degrees F ambient temperature. In the pricing wars to sell servers, he who goes green and saves on data center cooling budget will win big, since now everyone realizes AC costs more than hardware for server farms. And this is not on new special heat-tolerant gear; I heard they will put this in writing even for their older units. From that I would conclude that at least commercial server gear can take a lot more abuse than it gets and still not be affected enough to make components fail, because if it could, Intel could not afford to make this guarantee. YMMV of course. I still feel nervous running equipment in this kind of environment, but after 3 years of doing that, including commodity desktops, I haven't seen any abnormal failures.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Stefan Ring
>
> I've been running a raidz1 on three 1TB consumer disks for
> approx. 2 years now (about 90% full), and I scrub the pool every 3-4
> weeks and have never had a single error.

Well... You're probably not 100% active 100% of the time... And... Assuming the failure rate of drives is not linear, but skewed toward a higher failure rate after some period of time (say, 3 yrs), then you're more likely to experience no errors for the first year or two, and more likely to experience multiple simultaneous failures after 3 yrs or so.
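A minimal sketch of that skewed-failure-rate idea, using a Weibull hazard whose shape parameter models wear-out; the parameter values are purely illustrative, not fitted to any real drive population.

    def weibull_hazard(age_years, beta=3.0, eta=6.0):
        """Instantaneous failure rate at a given age, shape beta, scale eta (years)."""
        return (beta / eta) * (age_years / eta) ** (beta - 1)

    for age in (0.5, 1, 2, 3, 4, 5):
        print(f"age {age:>3} yr: hazard {weibull_hazard(age):.3f} per year")
    # The hazard around year 4-5 is more than an order of magnitude above year 1,
    # which is why a clean first year or two says little about what comes later.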
On 01/25/12 09:08, Edward Ned Harvey wrote:
> Assuming the failure rate of drives is not linear, but skewed toward
> a higher failure rate after some period of time (say, 3 yrs) ...

See section 3.1 of the Google study:

http://research.google.com/archive/disk_failures.pdf

although section 4.2 of the Carnegie Mellon study is much more supportive of the assumption:

http://www.usenix.org/events/fast07/tech/schroeder/schroeder.pdf
On Wed, 25 Jan 2012, Anonymous Remailer (austria) wrote:
> I've been watching the heat control issue carefully since I had to take
> a job offshore (cough reverse H1B cough) in a place without adequate AC,
> and I was able to get them to ship my servers and some other gear. Then
> I read that Intel is guaranteeing their servers will work at up to 100
> degrees F ambient temperature.

Most servers seem to be specified to run at up to 95 degrees, with some particularly dense ones specified to handle only 90. Network switching gear is usually specified to handle 105. My own equipment typically experiences up to 83 degrees during the peak of summer (but quite a lot more if the AC fails).

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Tue, Jan 24, 2012 at 10:50 AM, Stefan Ring <stefanrin at gmail.com> wrote:
> After having read this mailing list for a little while, I get the
> impression that there are at least some people who regularly
> experience on-disk corruption that ZFS should be able to report and
> handle. I've been running a raidz1 on three 1TB consumer disks for
> approx. 2 years now (about 90% full), and I scrub the pool every 3-4
> weeks and have never had a single error. From the oft-quoted 10^14
> error rate that consumer disks are rated at, I should have seen an
> error by now -- the scrubbing process is not the only activity on the
> disks, after all, and the data transfer volume from that alone clocks
> in at almost exactly 10^14 by now.

The 10^-14 (or 10^-15 or 10^-16) number is a statistical average. So if you have a big enough pool of drives, for every drive that moves more than 10^14 bits with no uncorrectable errors, there will be a drive that moves less than 10^14 bits before hitting an uncorrectable error. The three 1 TB consumer drives you have must have been manufactured on a "good day" and not a "bad day" :-)

Note the error rate is 10^-14 (or 10^-15 or 10^-16), which translates into one error per 10^14 bits (bytes?) transferred to / from the drive. Note the sign change on the exponent :-)

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players
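As a closing unit check on the rates discussed above (a sketch; the 10^16 figure is the enterprise-class rating mentioned earlier in the thread):

    # One error per 1e14 *bits* works out to one expected error per 12.5 TB
    # transferred (decimal TB, as drive vendors count).
    bits_per_error = 1e14
    tb_per_error = bits_per_error / 8 / 1e12
    print(f"consumer   (1e14 bits): one expected error per {tb_per_error:.1f} TB")
    print(f"enterprise (1e16 bits): one expected error per {tb_per_error * 100:.0f} TB")
    # -> 12.5 TB vs 1250 TB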