I'm new to this group, so hello everyone! I am having some issues with my Nexenta system I set up about two months ago as a zfs/zraid server. I have two new Maxtor 500GB SATA drives and an Adaptec controller which I believe has a Silicon Image chipset. Also I have a Seasonic 80+ power supply, so the power should be as clean as you can get. I had an issue with Nexenta where I had to reinstall, and since then every time I reboot I have to type

zpool export amber
zpool import amber

to get my zfs volume mounted. A week ago I noticed a couple of CKSUM errors when I did a zpool status, so I did a zpool scrub. This is the output after:

# zpool status
  pool: amber
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
config:

        NAME        STATE     READ WRITE CKSUM
        amber       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c4d0    ONLINE       0     0    51
            c5d0    ONLINE       0     0    41

errors: No known data errors

I have md5sums on a lot of the files and it looks like maybe 5% of my files are corrupted. Does anyone have any ideas? I was under the impression that zfs was pretty reliable, but I guess with any software it needs time to get the bugs ironed out.

Michael
On 11/18/06, zfs at michael.mailshell.com <zfs at michael.mailshell.com> wrote:
...
> scrub: scrub completed with 0 errors on Mon Nov 13 04:49:35 2006
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         amber       ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             c4d0    ONLINE       0     0    51
>             c5d0    ONLINE       0     0    41
>
> errors: No known data errors
>
> I have md5sums on a lot of the files and it looks like maybe 5% of
> my files are corrupted. Does anyone have any ideas?

Michael,
as far as I can see, your setup does not meet the minimum
redundancy requirements for a Raid-Z, which is 3 devices.
Since you only have 2 devices you are out on a limb.

Please read the manpage for the zpool command and pay close
attention to the restrictions in the section on raidz.

> I was under the impression that zfs was pretty reliable but I
> guess with any software it needs time to get the bugs ironed out.

ZFS is reliable. I use it - mirrored - at home. If I was going to use
raidz or raidz2 I would make sure that I followed the instructions in
the manpage about the number of devices I need in order to guarantee
redundancy and thus reliability, rather than making an assumption.

You should also check the output of "iostat -En" and see whether your
devices are listed there with any error counts.

James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
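For reference, the check James suggests is a single command; the counters worth scanning in its per-device stanzas are the "Soft Errors", "Hard Errors" and "Transport Errors" fields (a minimal sketch only - the c4d0/c5d0 entries should match the device names shown by zpool status):

    # dump the per-device error totals the kernel has kept since boot
    iostat -En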
Hi Michael.  Based on the output, there should be no user-visible file
corruption.  ZFS saw a bunch of checksum errors on the disk, but was
able to recover in every instance.

While 2-disk RAID-Z is really a fancy (and slightly more expensive,
CPU-wise) way of doing mirroring, at no point should your data be at
risk.

I've been working on ZFS a long time, and if what you say is true, it
will be the first instance I have ever seen (or heard) of such a
phenomenon.  I strongly doubt that somehow ZFS returned corrupted data
without knowing about it.  How are you sure that some application on
your box didn't modify the contents of the files?


--Bill


On Sat, Nov 18, 2006 at 02:01:39AM -0800, zfs at michael.mailshell.com wrote:
> I'm new to this group, so hello everyone! I am having some issues with
> my Nexenta system I set up about two months ago as a zfs/zraid server.
[...]
> I have md5sums on a lot of the files and it looks like maybe 5% of my
> files are corrupted. Does anyone have any ideas? I was under the
> impression that zfs was pretty reliable, but I guess with any software
> it needs time to get the bugs ironed out.
>
> Michael
On 18-Nov-06, at 2:01 PM, Bill Moore wrote:
> Hi Michael. Based on the output, there should be no user-visible file
> corruption. ZFS saw a bunch of checksum errors on the disk, but was
> able to recover in every instance.
>
> While 2-disk RAID-Z is really a fancy (and slightly more expensive,
> CPU-wise) way of doing mirroring, at no point should your data be at
> risk.
>
> I've been working on ZFS a long time, and if what you say is true, it
> will be the first instance I have ever seen (or heard) of such a
> phenomenon. I strongly doubt that somehow ZFS returned corrupted data
> without knowing about it.

Also, I'd check your RAM.

--Toby

> How are you sure that some application on
> your box didn't modify the contents of the files?
>
> --Bill
>
> On Sat, Nov 18, 2006 at 02:01:39AM -0800, zfs at michael.mailshell.com wrote:
>> I'm new to this group, so hello everyone! I am having some issues
>> with my Nexenta system I set up about two months ago as a zfs/zraid
>> server.
[...]
On Sat, 18 Nov 2006 zfs at michael.mailshell.com wrote:

.... reformatted ...

> I'm new to this group, so hello everyone! I am having some issues with

Welcome!

> my Nexenta system I set up about two months ago as a zfs/zraid server. I
> have two new Maxtor 500GB SATA drives and an Adaptec controller which I
> believe has a Silicon Image chipset. Also I have a Seasonic 80+ power
> supply, so the power should be as clean as you can get. I had an issue

Just wondering (out loud) if your PSU is capable of meeting the demands
of your current hardware - including the zfs related disk drives you just
added - and if the system is on a UPS.  Just questions for you to answer
and off topic for this list.  But you'll see that this thought process is
relevant to your particular problem - see more below.

> with Nexenta where I had to reinstall, and since then every time I reboot
> I have to type
>
> zpool export amber
> zpool import amber
>
> to get my zfs volume mounted. A week ago I noticed a couple of CKSUM
> errors when I did a zpool status, so I did a zpool scrub. This is the
> output after:
>
> # zpool status
> [...]
>
> I have md5sums on a lot of the files and it looks like maybe 5% of my
> files are corrupted. Does anyone have any ideas? I was under the
> impression that zfs was pretty reliable but I guess with any software it
> needs time to get the bugs ironed out.

[ I've seen the response where one astute list participant noticed you're
running a 2-way raidz device, when the documentation clearly states that
the minimum raidz volume consists of 3 devices ]

Going back to zero day (my terminology) for ZFS, when it was first
integrated, if you read the zfs related blogs, you'll realize that zfs is
arguably one of the most extensively tested bodies of software _ever_
added to (Open)Solaris.  If there was a basic issue with zfs, like you
describe above, zfs would never have been integrated (into
(Open)Solaris).  You can imagine that there were a lot of willing zfs
testers ("please can I be on the beta test...")[0] - but there were also
a few cases of "this issue has *got* to be ZFS related" - because there
were no other _rational_ explanations.  One such case is mentioned here:

http://blogs.sun.com/roller/page/elowe?anchor=zfs_saves_the_day_ta

I would suggest that you look for some basic hardware problems within
your system.  The first place to start is to download/burn a copy of the
Ultimate Boot CD ROM (UBCD) [1] and run the latest version of memtest86
for 24 hours.  It's likely that you have hardware issues.

Please keep the list informed....

[0] including this author who built hardware specifically to
eval/test/use ZFS and get it into production ASAP to solve a business
storage problem for $6k instead of $30k to $40k.

[1] http://www.ultimatebootcd.com/

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.
           al at logical-approach.com  Voice: 972.379.2133  Fax: 972.379.2134
           Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
> [ I've seen the response where one astute list participant noticed you're
> running a 2-way raidz device, when the documentation clearly states that
> the minimum raidz volume consists of 3 devices ]

Not very astute.  The documentation clearly states that the minimum is 2
devices.

zpool(1M):

     A raidz group with N disks of size X can  hold  approxi-
     mately (N-1)*X bytes and can withstand one device failing
     before data integrity is compromised. The minimum number
     of devices in a raidz group is 2. The recommended number
     is between 3 and 9.

If the minimum were actually 3, this configuration wouldn't work at all.

-frank
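As a quick sanity check of that formula against the two 500GB drives from the original post (a back-of-the-envelope sketch only; real usable space will be somewhat less after metadata overhead):

    # zpool(1M): a raidz group of N disks of size X holds about (N-1)*X
    N=2; X=500
    echo "approx usable: $(( (N - 1) * X )) GB"   # -> approx usable: 500 GB

which is the same usable space a two-way mirror of the same drives would give.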
First thing is I would like to thank everyone for their replies/help.

This machine has been running for two years under Linux, but for the last
two or three months has had Nexenta Solaris on it. This machine has never
once crashed. I rebooted with a Knoppix disk in and ran memtest86. Within
30 minutes it counted several hundred errors which, after cleaning the
connections, still occurred in the same locations. I replaced the RAM
module and retested with no errors. My md5sums all verified no data was
lost, making me very happy. I did a zpool scrub which came back 100%
clean. I still don't understand how the machine ran reliably with bad
RAM. That being said, a few days later I did a zpool status and saw 20
checksum errors on one drive and 30 errors on the other.

Does anyone have any idea why I have to do "zpool export amber" followed
by "zpool import amber" for my zpool to be mounted on reboot? zfs set
mountpoint does nothing.

BTW to answer some other concerns, the Seasonic supply is 400 watts with
a guaranteed minimum efficiency of 80%. Using a kill-o-watt meter I have
about 120 watts power consumption. The machine is on a UPS.
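On the mount question, a few things that may be worth checking before the next reboot (a sketch only - on stock Solaris, pools recorded in /etc/zfs/zpool.cache are imported automatically at boot and their filesystems mounted via "zfs mount -a"; Nexenta's boot scripts may behave differently, so treat this as a starting point rather than a known fix):

    # is the top-level dataset's mountpoint sane, and does ZFS think it is mounted?
    zfs get -r mountpoint,mounted amber

    # does a pool cache file exist for the boot-time import to use?
    ls -l /etc/zfs/zpool.cache

    # once the pool is imported, try mounting by hand instead of export/import
    zfs mount -a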
Hi,

I'll recommend going over the zfs presentation. One of the points they
listed was that - even in the case of silent errors (like you noticed) -
other systems just go on. Your data gets silently corrupted and you'd
never notice it. If there are a few bit flips in jpegs and movie files,
it will almost never be noticeable. However, there are places where it
will cause a catastrophe, but in day-to-day cases we don't come across
them, or even if we do we attribute them to $CAUSE, forget and go on.
ZFS tries to fix this problem as one of its core goals (that is why
block checksums are there).

Rest assured, zfs + solaris has only uncovered - and made uncomfortably
evident - a problem that was latent all along. Now, that the uncovering
itself may cause you pain is a different issue. Ignorance is bliss for
most humans :-)
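Michael's md5sum list is exactly this kind of end-to-end check done by hand at the application level. For anyone wanting to do the same, a minimal sketch (assuming GNU md5sum as shipped with Nexenta, and a hypothetical /amber mountpoint and /var/tmp/amber.md5 listing file - adjust paths to taste):

    # record a checksum for every file in the pool
    find /amber -type f -exec md5sum {} + > /var/tmp/amber.md5

    # later, re-verify and show only the files whose contents no longer match
    md5sum -c /var/tmp/amber.md5 | grep -v ': OK$'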
On 11/26/06, Akhilesh Mritunjai <mritun+opensolaris at gmail.com> wrote:
> I'll recommend going over the zfs presentation. One of the points they
> listed was that - even in the case of silent errors (like you noticed) -
> other systems just go on. Your data gets silently corrupted and you'd
> never notice it. If there are a few bit flips in jpegs and movie files,
> it will almost never be noticeable. However, there are places where it
> will cause a catastrophe, but in day-to-day cases we don't come across
> them, or even if we do we attribute them to $CAUSE, forget and go on.
> ZFS tries to fix this problem as one of its core goals (that is why
> block checksums are there).

The fact that ZFS will detect and report errors that other systems
silently gloss over is fairly well documented at this point, and it's a
big win for ZFS, and part of my motivation for running it.

However, what you say about bit flips in jpegs, at least, is misleading.
If you never open the file you won't notice -- but that's true for *any*
file, of course! If you *do* open the file, everything after the flipped
bit will be drastically altered, or completely unreadable. I've viewed a
number of damaged jpegs, and the visible consequences are always really
drastic.

Now, in an uncompressed TIFF file, it'd be mostly invisible, because it
would affect only one pixel. The issue is that jpeg is a heavily
compressed format; the next data always depends on the previous data, so
everything after an error is changed.

-- 
David Dyer-Bennet, <mailto:dd-b at dd-b.net>, <http://www.dd-b.net/dd-b/>
RKBA: <http://www.dd-b.net/carry/>
Pics: <http://www.dd-b.net/dd-b/SnapshotAlbum/>
Dragaera/Steven Brust: <http://dragaera.info/>
On Sat, 25 Nov 2006 zfs at michael.mailshell.com wrote:

.... reformatted ...

> First thing is I would like to thank everyone for their replies/help.
> This machine has been running for two years under Linux, but for the last
                                    ^^^^^^^^^
Uh Oh - possible CPU fan "fatigue" time... more below.

> two or three months has had Nexenta Solaris on it. This machine has
> never once crashed. I rebooted with a Knoppix disk in and ran memtest86.
> Within 30 minutes it counted several hundred errors which, after cleaning
> the connections, still occurred in the same locations. I replaced the RAM
> module and retested with no errors. My md5sums all verified no data was
> lost, making me very happy. I did a zpool scrub which came back 100%
> clean. I still don't understand how the machine ran reliably with bad
> RAM. That being said, a few days later I did a zpool status and saw 20
> checksum errors on one drive and 30 errors on the other.

You're still chasing a hardware issue(s) IMHO.

First, ensure that you are blowing air over the HDA (Head Disk Assembly)
of your installed hard drives.  The drives don't care if the airflow is
from back to front, left to right, right to left, front to back etc.  And
it does not have to be a lot of air - as long as there is positive
airflow over the HDA and the disk drive controller electronics.
Otherwise, it's likely that the disk drives will overheat while there is
a lot of head movement taking place.  My suggestion is to get a 92mm
fan(s) with a hard disk type connector and jury rig the fan(s) to blow
air across the drives.  Do whatever it takes to secure the fans in
position - bent wire hangers secured to the case will work!  It may not
look pretty - but it'll get the job done.  Or .. mount the drives in
drive canisters with built-in fans.

Next is to check for hotspots within the box.  Check that the memory
SIMMs are getting good airflow.  A great way to resolve this type of
issue is to use the Zalman Fan Bracket (FB123) and one or more 92mm fans.
The bracket itself is hard to explain - but it allows you to attach up to
4 fans in slots and position them above anything that is a hot-spot -
including the motherboard chipset, RAM SIMMs, graphics boards, gigabit
ethernet cards etc.  A picture is worth a 1000 words:
http://www.endpcnoise.com/cgi-bin/e/std/sku=fb123
Note: this is not an endorsement of this site - just a good picture -
since the Zalman site (zalmanusa.com) is a pain to navigate.

Still on the cooling thread - the Seasonic PSUs are highly rated and very
quiet.  But ... they don't move enough air through your box and should be
supplemented with an intake fan (if your box has provision to add one)
and a rear panel mounted exhaust fan.  Many PC users have upgraded their
PSUs and been careful to select a quiet PSU - but they did not realize
that the quiet PSU, with its slow moving fan, greatly reduced the
existing airflow through the box.  The PSU can run effectively with the
reduced airflow - but not the other components in the system.

If you want to apply science and actually measure your box for hotspots,
I suggest you run the box at the usual ambient temp, with the usual
active workload, then carefully remove the side cover very quickly (while
the box is still running) and use a Fluke IR (Infra Red) thermal probe[1]
to measure for hot spots.  Record the CPU heatsink temp, RAM DIMMs, HDA,
motherboard chipset etc.  You can also busy out the box by running SETI
and/or beating up on the disk drives and take more measurements[2].  Then
after you apply the fixes ... retest.
A couple of pointers that may help.  If your box has an 80mm exhaust fan,
replace it with a 92mm (or 120mm) fan and use a plastic 90mm to 80mm
adaptor.  This'll increase airflow without increasing the noise.  Also,
Zalman makes a small "gizmo" that you put inline with a fan, that allows
you to vary the fan speed and set the speed to get the best noise/cooling
tradeoff for your box.  It's called the "Fan Mate 2".

Last item on cooling (sorry) - many older systems that used small CPU fan
based coolers die after only 2 years.  But in many cases, the fan does
not actually stop turning - it slows down dramatically.  And sometimes
it'll slow down only after it heats up a little.  So if you take the side
cover off after the system has been running for a couple of hours, you'll
see the fan turning slowly - and touching the CPU heatsink will probably
burn your finger.  If you check it a minute after first powering up the
system, it'll look normal and completely fool you.  When this happens
(fan slows down), the CPU temp will increase, its thermal resistance will
go lower, and it'll draw more current ... which will generate even more
heat.  This is the classic symptom of what we call "thermal runaway".

A slightly more subtle variant of this issue is with the AMD factory
coolers.  After you remove the CPU heatsink fan, you'll notice a lot of
dirt/dust blocking up to 1/2 the area of the heatsink and greatly
reducing the airflow.  But ... you *have to* remove the fan to actually
see this. [3]  If you have this issue, I suggest you replace the (AMD)
factory cooler with a Zalman product. [4]

In general, (Open)Solaris is a great system *exerciser*.  It will usually
flush out marginal hardware that appears to work just fine with other,
"impaired" Operating Systems.

> Does anyone have any idea why I have to do "zpool export amber" followed
> by "zpool import amber" for my zpool to be mounted on reboot? zfs set
> mountpoint does nothing.

This may be an issue unique to Nexenta - I don't know.  First get the
hardware completely rock-solid - then look for the software issues.

> BTW to answer some other concerns, the Seasonic supply is 400 watts with
> a guaranteed minimum efficiency of 80%. Using a kill-o-watt meter I have
> about 120 watts power consumption. The machine is on a UPS.

[1] I use an older model which requires a separate DMM (digital
multi-meter) with 1/10 of a milli-volt resolution.  A Fluke DMM of
course!  But now the "Fluke 60 Series Handheld Infrared Thermometers" are
accurate and affordable.  For example:
http://www.testequipmentdepot.com/fluke/thermometers/62.htm

[2] but don't do this until you've determined that you have reasonable
airflow within the box or you'll probably damage something.

[3] Email me offlist with your motherboard and CPU type and I can
probably make a recommendation.

[4] I proposed this solution to a user on the solarisx86 at yahoogroups.com
list - and it resolved his problem.  His problem: the system would reset
after getting about 1/2 way through a Solaris install.  The installer was
simply acting as a good system exerciser and heating up his CPU until it
glitched out.  After he removed the CPU fan and cleaned up his heatsink,
he loaded up Solaris successfully.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.
al at logical-approach.com  Voice: 972.379.2133  Fax: 972.379.2134
Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
On 11/26/06, Al Hopper <al at logical-approach.com> wrote:
> [4] I proposed this solution to a user on the solarisx86 at yahoogroups.com
> list - and it resolved his problem.  His problem: the system would reset
> after getting about 1/2 way through a Solaris install.  The installer was
> simply acting as a good system exerciser and heating up his CPU until it
> glitched out.  After he removed the CPU fan and cleaned up his heatsink,
> he loaded up Solaris successfully.

I just identified and fixed exactly this symptom on my mother's Windows
system, in fact; it'd get half-way through an install, then start getting
flakier and flakier, and fairly soon refuse to boot at all. This made me
think "heat", and on examination the fan on the CPU cooler wasn't
spinning *at all*. It's less than two years old -- but one of the three
wires seems to be broken off right at the fan, so that may be the
problem. It's not seized up physically, though it's a bit stiff.

Anyway, while the software here isn't Solaris, the basic diagnostic issue
is the same. This kind of thing is remarkably common, in fact!

This one has a nearly-good ending, since nothing appears to have cooked
enough to be permanently ruined. Only nearly-good because I had to bend
the heatsink to get the replacement 70mm fan to fit; the screw holes
lined up, but the new one was physically slightly too large, about a mm,
to fit on the heatsink.

-- 
David Dyer-Bennet, <mailto:dd-b at dd-b.net>, <http://www.dd-b.net/dd-b/>
RKBA: <http://www.dd-b.net/carry/>
Pics: <http://www.dd-b.net/dd-b/SnapshotAlbum/>
Dragaera/Steven Brust: <http://dragaera.info/>
David Dyer-Bennet wrote:
> On 11/26/06, Al Hopper <al at logical-approach.com> wrote:
>
>> [4] I proposed this solution to a user on the solarisx86 at yahoogroups.com
>> list - and it resolved his problem.  His problem: the system would reset
>> after getting about 1/2 way through a Solaris install.  The installer was
>> simply acting as a good system exerciser and heating up his CPU until it
>> glitched out.  After he removed the CPU fan and cleaned up his heatsink,
>> he loaded up Solaris successfully.
>
> I just identified and fixed exactly this symptom on my mother's
> Windows system, in fact; it'd get half-way through an install, then
> start getting flakier and flakier, and fairly soon refuse to boot at
> all. This made me think "heat", and on examination the fan on the CPU
> cooler wasn't spinning *at all*. It's less than two years old -- but
> one of the three wires seems to be broken off right at the fan, so
> that may be the problem. It's not seized up physically, though it's a
> bit stiff.
>
> Anyway, while the software here isn't Solaris, the basic diagnostic
> issue is the same. This kind of thing is remarkably common, in fact!

Yep, the top 4 things that tend to break are: fans, power supplies,
disks, and memory (in no particular order).  The enterprise-class
systems should monitor the fan speed and alert when they are not
operating normally.
 -- richard
Hello James,

Saturday, November 18, 2006, 11:34:52 AM, you wrote:

JM> as far as I can see, your setup does not meet the minimum
JM> redundancy requirements for a Raid-Z, which is 3 devices.
JM> Since you only have 2 devices you are out on a limb.

Actually, only two disks for raid-z is fine and you do get redundancy.
However, it would make more sense to do a mirror with just two disks -
performance would be better and the available space would be the same.

-- 
Best regards,
 Robert                            mailto:rmilkowski at task.gda.pl
                                   http://milek.blogspot.com
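For anyone building a similar two-disk box, the difference Robert describes is only in how the pool is created; an existing raidz pool can't be converted in place, so switching would mean backing up the data, destroying the pool and recreating it (a sketch using the device names from this thread):

    # what Michael has today: a 2-device raidz, roughly 500GB usable
    zpool create amber raidz c4d0 c5d0

    # the suggested alternative: a 2-way mirror, same usable space, better performance
    zpool create amber mirror c4d0 c5d0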