Uwe Dippel
2009-Apr-15 14:32 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
My question is related to this:

# zpool status
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h46m with 0 errors on Tue Apr 14 00:19:34 2009
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c1d0s0    ONLINE       0     0     1

errors: No known data errors

Since it is a rather new drive and has no trouble with Ubuntu, I dared to clear it:

# zpool clear rpool

and then checked it for errors:

# zpool scrub rpool
# zpool status -v
  pool: rpool
 state: ONLINE
 scrub: scrub completed after 0h47m with 0 errors on Tue Apr 14 23:53:48 2009
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c1d0s0    ONLINE       0     0     0

errors: No known data errors

Now I wonder where that error came from. It was just a single checksum error. It wouldn't go away with an earlier scrub, and it seemingly left no traces of badness on the drive. Something serious? At least it looks a tad contradictory: "Applications are unaffected.", the error is "unrecoverable", and yet once cleared, there is no error left.

Curious,

Uwe
Cindy.Swearingen at Sun.COM
2009-Apr-15 15:05 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Hi Uwe,

You can use fmdump to help determine whether these disk errors are persistent. Running fmdump -ev will provide a lot of detail, but from it you can review how many disk errors have occurred and over what period. A brief description is provided here:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#Diagnosing_Potential_Problems

Cindy

Uwe Dippel wrote:
> My question is related to this:
> [...]
> Now I wonder where that error came from. It was just a single checksum
> error. It wouldn't go away with an earlier scrub, and seemingly left no
> traces of badness on the drive. Something serious? At least it looks a
> tad contradictory: "Applications are unaffected.", it is unrecoverable,
> and once cleared, there is no error left.
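For instance, the error log can be narrowed to just the ZFS checksum events. This is a sketch - fmdump does take a -c class filter, but the exact flags and output columns may vary by Solaris release:

# fmdump -e -c 'ereport.fs.zfs.checksum'
TIMESTAMP            CLASS
Mar 27 22:27:42.3147 ereport.fs.zfs.checksum
Apr 13 21:29:35.7397 ereport.fs.zfs.checksum

Repeated events against the same device over days or weeks suggest a drive on its way out; a one-off entry is more consistent with a transient fault somewhere in the data path.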
Richard Elling
2009-Apr-15 15:23 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Uwe Dippel wrote:
> [...]
> Now I wonder where that error came from. It was just a single checksum
> error. It wouldn't go away with an earlier scrub, and seemingly left
> no traces of badness on the drive. Something serious? At least it
> looks a tad contradictory: "Applications are unaffected.", it is
> unrecoverable, and once cleared, there is no error left.

Since there are "no known data errors," it was fixed, and the scrub should succeed without errors. You cannot conclude that the drive is completely free of faults using scrub; you can only test the areas of the drive which hold active data. Or, to look at it another way: defects on the disk which can be corrected at the file system level will be.

As Cindy notes, more detailed info is available in FMA. But know that ZFS can detect transient faults, as well as permanent faults, almost anywhere in the data path.
 -- richard
Bob Friesenhahn
2009-Apr-15 15:33 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Wed, 15 Apr 2009, Uwe Dippel wrote:
> Now I wonder where that error came from. It was just a single checksum
> error. It wouldn't go away with an earlier scrub, and seemingly left no
> traces of badness on the drive. Something serious? At least it looks a
> tad contradictory: "Applications are unaffected.", it is unrecoverable,
> and once cleared, there is no error left.

Since it was not reported that user data was impacted, it seems likely that there was a read failure (or bad checksum) for ZFS metadata, which is redundantly stored. It could just as well have been file data; you were lucky this time. If you are worried about your individual files, then it might be wise to set copies=2 so that file data is duplicated, but this will consume more space and reduce write performance. It is better to add a mirror disk if you can, since with a single disk the whole disk could fail.

Ubuntu Linux is unlikely to notice data problems unless the drive reports hard errors. ZFS is much better at checking for errors.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
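For the record, both options are one-liners; the dataset and device names below are hypothetical, copies=2 only protects data written after the property is set, and a root pool would also need boot blocks installed on the newly attached disk:

# zfs set copies=2 rpool/export/home     (duplicate future file data on the same disk)
# zpool attach rpool c1d0s0 c2d0s0       (turn the single-disk pool into a mirror)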
Uwe Dippel
2009-Apr-15 15:38 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Richard Elling wrote:
>> status: One or more devices has experienced an unrecoverable error.  An
>>         attempt was made to correct the error.  Applications are unaffected.
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         rpool       ONLINE       0     0     0
>>           c1d0s0    ONLINE       0     0     1
>> errors: No known data errors
>>
>> # zpool clear rpool
>> # zpool status -v
>>   pool: rpool
>>  state: ONLINE
>>  scrub: scrub completed after 0h47m with 0 errors on Tue Apr 14 23:53:48 2009
>> config:
>>         NAME        STATE     READ WRITE CKSUM
>>         rpool       ONLINE       0     0     0
>>           c1d0s0    ONLINE       0     0     0
>> errors: No known data errors
>>
>> Now I wonder where that error came from. [...]
>
> Since there are "no known data errors," it was fixed, and the scrub
> should succeed without errors. You cannot conclude that the drive
> is completely free of faults using scrub, you can only test the areas
> of the drive which have active data.

I didn't conclude that. I conclude, when an 'unrecoverable error' is found, that 'zpool clear' cannot recover it. Still, there was one CKSUM error before, and it wouldn't go away before the 'clear'; while after the 'clear' even that one disappeared.

> As Cindy notes, more detailed info is available in FMA. But know
> that ZFS can detect transient faults, as well as permanent faults,
> almost anywhere in the data path.

So this is the respective output:

Feb 16 2009 23:18:47.848442332 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.dev.uderr
        ena = 0xd0dd396561a00001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci1565,3409@4,1/storage@4/disk@0,0
                devid = id1,sd@f00551e8c4980493b000551a00000
        (end detector)
        driver-assessment = fail
        op-code = 0x1a
        cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
        pkt-reason = 0x0
        pkt-state = 0x1f
        pkt-stats = 0x0
        stat-code = 0x0
        un-decode-info = sd_get_write_cache_enabled: Mode Sense caching page code mismatch 0
        un-decode-value
        __ttl = 0x1
        __tod = 0x499983d7 0x329233dc

Mar 27 2009 22:27:42.314752029 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0xb393a3ba200001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0xf6bd78c1d3b3c878
                vdev = 0x38287e797d1642bc
        (end detector)
        pool = rpool
        pool_guid = 0xf6bd78c1d3b3c878
        pool_context = 0
        pool_failmode = continue
        vdev_guid = 0x38287e797d1642bc
        vdev_type = disk
        vdev_path = /dev/dsk/c2d0s0
        vdev_devid = id1,cmdk@AWDC_WD6400AAKS-65A7B0=_____WD-WMASY4847131/a
        parent_guid = 0xf6bd78c1d3b3c878
        parent_type = root
        zio_err = 50
        zio_offset = 0x13a4c00000
        zio_size = 0x20000
        zio_objset = 0x13f
        zio_object = 0x20ff4
        zio_level = 0
        zio_blkid = 0xa
        __ttl = 0x1
        __tod = 0x49cce25e 0x12c2bc1d

Apr 13 2009 21:29:35.739718381 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0xb6afed32000001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0xf6bd78c1d3b3c878
                vdev = 0x38287e797d1642bc
        (end detector)
        pool = rpool
        pool_guid = 0xf6bd78c1d3b3c878
        pool_context = 0
        pool_failmode = continue
        vdev_guid = 0x38287e797d1642bc
        vdev_type = disk
        vdev_path = /dev/dsk/c1d0s0
        vdev_devid = id1,cmdk@AWDC_WD6400AAKS-65A7B0=_____WD-WMASY4847131/a
        parent_guid = 0xf6bd78c1d3b3c878
        parent_type = root
        zio_err = 50
        zio_offset = 0x421660000
        zio_size = 0x20000
        zio_objset = 0x107
        zio_object = 0x38dbf
        zio_level = 0
        zio_blkid = 0x4
        __ttl = 0x1
        __tod = 0x49e33e3f 0x2c1734ed
#

So I had not that many errors in the last two months: three. I'm sorry, but my question remains unanswered: where did the unrecoverable error come from, and how could it go away?

Uwe
Uwe Dippel
2009-Apr-15 15:49 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Bob Friesenhahn wrote:
> Since it was not reported that user data was impacted, it seems likely
> that there was a read failure (or bad checksum) for ZFS metadata which
> is redundantly stored.

(Maybe I am too much of a linguist not to stumble over the wording here.) If it is 'redundant', it is 'recoverable', am I right? Why, if this is the case, does scrub not recover it? And why does scrub even fail to correct the CKSUM error as long as it is flagged 'unrecoverable', yet can do exactly that after the 'clear' command?

> Ubuntu Linux is unlikely to notice data problems unless the drive
> reports hard errors. ZFS is much better at checking for errors.

No doubt. But ext3 also seems to need much less attention and far fewer commands, which leaves it a viable alternative. I still hope that one day ZFS will be as simple to maintain as ext3 - or rather, will do all that maintenance on its own. :)

Uwe
On Wed, Apr 15, 2009 at 11:49 AM, Uwe Dippel <udippel at gmail.com> wrote:
> (Maybe I am too much of a linguist not to stumble over the wording here.)
> If it is 'redundant', it is 'recoverable', am I right? Why, if this is the
> case, does scrub not recover it, and scrub even fails to correct the CKSUM
> error as long as it is flagged 'unrecoverable', but can do exactly that
> after the 'clear' command?
> [...]
> No doubt. But ext3 also seems to need much less attention, very much fewer
> commands. Which leaves it as a viable alternative. I still hope that one
> day ZFS will be maintainable as simply as ext3; respectively do all that
> maintenance on its own. :)
>
> Uwe

You only need to decide what you want here. Yes, ext3 requires less maintenance, because it can't tell you if a block becomes corrupt (though fsck'ing when that *does* happen can take hours, compared to zfs replacing a bad block with a good one from the other half of your mirror).

ZFS can *fully* do its job only when it has several copies of blocks to choose from. Since you have only one disk here, ZFS can only say 'hey, your checksum for this block is bad - sorry'. ext3 might do the same thing, though only if you tried to use the block with an application that knew what the block was supposed to look like.

That said, I think your comments raise a valid point that ZFS could be a little easier for individuals to use. I totally understand why Sun doesn't focus on end-user management tools (not their market) - on the other hand, the code is out there, so if you see a problem, get some people together to write some management tools! :)
Fajar A. Nugraha
2009-Apr-15 17:05 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Wed, Apr 15, 2009 at 10:49 PM, Uwe Dippel <udippel at gmail.com> wrote:
> (Maybe I am too much of a linguist not to stumble over the wording here.)
> If it is 'redundant', it is 'recoverable', am I right?

Looking at the message link http://www.sun.com/msg/ZFS-8000-9P :

"Description: A device has experienced uncorrectable errors in a replicated configuration."

This means the particular block on that device had "uncorrectable errors" (a read failure or bad checksum, as Bob pointed out), possibly due to a bad sector. So "uncorrectable" refers to that particular data block or device.

In your case, since the error is most likely on zfs metadata (which is automatically stored redundantly), zfs is able to read the redundant copy and replace the bad metadata. The same thing would also happen if the error were on user data that has redundancy through either:
- a mirror or raidz vdev, or
- copies=2 (or more)

Had the error occurred on user data, on a non-mirrored, non-raidz pool, with copies=1 (the default), you would've got http://www.sun.com/msg/ZFS-8000-8A

> Why, if this is the case, does scrub not recover it

It does, since in this case the data is stored redundantly.

> , and scrub even fails to correct the CKSUM
> error as long as it is flagged 'unrecoverable', but can do exactly that
> after the 'clear' command?

"clear" simply resets the error counter back to 0. On your next run, the bad block is probably still unused. Since zfs scrub only checks used blocks, the bad block is not checked. That gives the impression that the error has magically gone away, when in fact you may re-experience it if the bad block is reused later.

> No doubt. But ext3 also seems to need much less attention, very much fewer
> commands. [...]

In a sense, zfs already "does all maintenance on its own" the same way ext3 does:
- both store metadata (the superblock, on ext3) redundantly
- both can recover cleanly from an unclean shutdown (power failure, etc.)
- on both filesystems, if a bad sector occurs on non-redundant data, you'll simply be unable to access it.

You can mimic the "ignore errors" behavior of ext3 somewhat by setting checksum=off, as sketched below. Not recommended, but a usable setting if you already have redundancy at a lower level (e.g. hardware RAID) and you trust it completely.

Regards,

Fajar
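That last knob is an ordinary dataset property; the dataset name here is hypothetical:

# zfs set checksum=off tank/scratch
# zfs get checksum tank/scratch
NAME          PROPERTY  VALUE     SOURCE
tank/scratch  checksum  off       local

With checksums off, zfs behaves like ext3 in this respect: reads return whatever the lower layers supply, with no end-to-end verification and no self-healing.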
Richard Elling
2009-Apr-15 17:42 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Uwe Dippel wrote:
> Richard Elling wrote:
>> Since there are "no known data errors," it was fixed, and the scrub
>> should succeed without errors. You cannot conclude that the drive
>> is completely free of faults using scrub, you can only test the areas
>> of the drive which have active data.
>
> I didn't conclude that.

Could you propose alternate wording?

> I conclude, when an 'unrecoverable error' is found, that 'zpool clear'
> cannot recover it.

ZFS did recover, which is why it says "no known data errors." If the data were not recoverable, then it would show you which file was affected. Perhaps the confusion is about which layer is reporting the bad data?

In the fmdump output, there is a ZFS checksum mismatch detected. It is unclear why there is a mismatch, because there was no corresponding error event logged by the disk driver. What ZFS knows is that the data it read did not match the data it wrote. So ZFS repaired the data. Since ZFS is a COW architecture, the repair would involve writing the corrected data elsewhere.

> Still, there was one CKSUM error before, and it wouldn't go away
> before the 'clear'; while after the 'clear' even that one disappeared.

Clear just resets the counters.
 -- richard
Jens Elkner
2009-Apr-16 03:14 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Wed, Apr 15, 2009 at 10:32:13PM +0800, Uwe Dippel wrote:
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are unaffected.
...
> errors: No known data errors
>
> Now I wonder where that error came from. It was just a single checksum

Hmmm, I had a similarly curious thing ~2 weeks ago with a StorEdge 3510 (2x2Gbps FC MP, 1 controller, 2x6 HDDs mirrored and exported as a single device, no ZIL etc. tricks) connected to an X4600: since grill party season has started, the 3510 decided, at a room temperature of 33°C, to go "offline" and take part in the party ;-). The result was that during the offline time, everything which tried to access a ZFS on that pool blocked (i.e. got no timeout or error) - from that point of view, more or less expected. After the 3510 came back, 'zpool status ...' showed something like this:

        NAME                                     STATE     READ WRITE CKSUM
        pool2                                    FAULTED    289K 4.03M     0
          c4t600C0FF000000000099C790E0144EC00d0  FAULTED    289K 4.03M     0  too many errors

errors: Permanent errors have been detected in the following files:
        pool2/home/stud/inf/foobar:<0x0>

Still, everything was blocking. After a 'zpool clear', all ZFS filesystems (~2300 on that pool) except the listed one were accessible, but the status message stayed unchanged. Curious - I thought that blocking/waiting for the device to come back, plus the ZFS transaction machinery, was made exactly for a situation like this, i.e. to "re-commit" un-ACKed actions ...

Anyway, finally scrubbing the pool brought it back to the normal ONLINE state without any errors. To be sure, I compared the ZFS in question with the backup from some hours earlier - no difference. So, the same question as in the subject.

BTW: Some days later we had an even bigger grill party (~38°C) - this time the X4xxx machines in that room decided to go offline and take part as well (the v4xx's kept running ;-)). So first the 3510, and some time later the X4600. This time the pool came back online in DEGRADED state, with some more errors like the above one, plus:

        <metadata>:<0x103>
        <metadata>:<0x4007>
        ...

Clearing and scrubbing again brought it back to the normal ONLINE state without any errors. Spot checks on the files noted as having errors showed no damage ... Everything nice (wrt. data loss), but curious ...

Regards,
jel.
--
Otto-von-Guericke University    http://www.cs.uni-magdeburg.de/
Department of Computer Science  Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany        Tel: +49 391 67 12768
Uwe Dippel
2009-Apr-16 09:38 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Thu, Apr 16, 2009 at 1:05 AM, Fajar A. Nugraha <fajar at fajar.net> wrote:

[...]

Thanks, Fajar, et al.

What this thread actually shows, alas, is that ZFS is rocket science. In 2009, one would expect a file system to 'just work'. Why would anyone want to have to 'status' it regularly; 'scrub' it when needed; and, if scrub doesn't do the trick (still not knowing how serious the 'unrecoverable error' is, as in this case), 'clear' it, 'scrub' again, follow up with another 'status', or even run the more advanced fmdump -eV to see all the hex values in there (leaving it to interpretation what those actually mean), and hope the system will still make it - only to get, in the end, the suggestion to 'add another disk for RAID'?

Seriously, guys and girls, I am pretty glad that I still run my servers on OpenBSD (despite all temptations to change to OpenSolaris), where I can 'boot and forget' about them until a patch requires my action. If I can't trust the metadata of a pool (which might disappear completely, or not, as we had to learn in here), and have to do all the tasks above manually, or write a script to do them for me (and how shall I do that, if even in here an unrecoverable error can seemingly be recovered and no real explanation is forthcoming), then by all means this is a dead-born project - with all due respect that I, as an engineer of 30 years, have for you guys.

I do guess and believe that ZFS is so much better a filesystem than any other, honestly. But the history of engineering has seen the best products fail because their advanced features completely bypassed the marketplace and its psychology. Even I, as an avid and responsible system administrator, am not sure I want to read 30+ pages of commands, explanations of ZFS messages, and comments:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

In the end, I don't feel like reading kernel code either. Both the kernel and the file system simply need to do the job. And if they tend to fall over for lack of maintenance (that is, manual control and configuration), they are useless in the real world. Yes, some will reiterate that with ZFS I can be sure to have 100% consistent data. That's all hunky dory. But we here simply cannot afford the huge effort that is seemingly required for it. And in 99%+ of the cases, a very standard and easily handled FFS/UFS with RAID and backup will just do - as much as I personally feel how great a step ZFS is in principle.

Uwe
Mattias Pantzare
2009-Apr-16 10:53 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Thu, Apr 16, 2009 at 11:38, Uwe Dippel <udippel at gmail.com> wrote:
> What this thread actually shows, alas, is that ZFS is rocket science.
> In 2009, one would expect a file system to 'just work'. Why would
> anyone want to have to 'status' it regularly, in case 'scrub' it, and
> if scrub doesn't do the trick (and still not knowing how serious the
> 'unrecoverable error' is - like in this case), 'clear' it, 'scrub'

You do not have to status it regularly if you don't want to, just as with any other file system. The difference is that you can - just as you can, and should, on the RAID system that you use with any other file system.

If you do not have any problems, ZFS will just work. If you have problems, ZFS will show them to you much better than EXT3, FFS, UFS or other traditional filesystems - and often fix them for you. In many cases you would get corrupted data, or have to run fsck, for the same error on FFS/UFS.

Scrub is much nicer than fsck; it is not easy to know the best answers to the questions that fsck will ask you when you have a serious metadata problem on FFS/UFS. And yes, you can get into trouble even on OpenBSD.

You also have to count the complexity of your volume manager, as ZFS is both a filesystem and a volume manager in one.
Casper.Dik at Sun.COM
2009-Apr-16 10:59 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
> If you do not have any problems, ZFS will just work. If you have
> problems, ZFS will show them to you much better than EXT3, FFS, UFS or
> other traditional filesystems - and often fix them for you. In many
> cases you would get corrupted data, or have to run fsck, for the same
> error on FFS/UFS.

As most data is "file data", none of the other filesystems would detect such an error. But the file would still be corrupted.

> Scrub is much nicer than fsck; it is not easy to know the best answers
> to the questions that fsck will ask you when you have a serious metadata
> problem on FFS/UFS. And yes, you can get into trouble even on OpenBSD.

Of course, if your memory is bad, you could see a transient error during a scrub.

Casper
Bob Friesenhahn
2009-Apr-16 17:26 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Thu, 16 Apr 2009, Uwe Dippel wrote:
> What this thread actually shows, alas, is that ZFS is rocket science.
> In 2009, one would expect a file system to 'just work'. Why would
> anyone want to have to 'status' it regularly, in case 'scrub' it, and

For common uses, ZFS is not any more complicated than your ephemeral gmail.com email account, but it seems that you have figured that out just fine. Good for you.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Thu, Apr 16, 2009 at 12:26 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> For common uses, ZFS is not any more complicated than your ephemeral
> gmail.com email account, but it seems that you have figured that out
> just fine. Good for you.

I can't say I've ever had to translate binary to recover an email from the trash bin with Gmail... which is for "common users". Unless, of course, you're suggesting "common users" will never want to recover a file after zfs alerts them it's corrupted.

He's got a very valid point, and the responses are disheartening at best. Just because other file systems don't detect the corruption, or require lots of work to recover, does not make it OK for zfs to do the same. Excuses are just that, excuses. He isn't asking for an excuse, he's asking for an answer.

--Tim
Richard Elling
2009-Apr-16 19:41 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Tim wrote:
> I can't say I've ever had to translate binary to recover an email from
> the trash bin with Gmail... which is for "common users". Unless of
> course you're suggesting "common users" will never want to recover a
> file after zfs alerts them it's corrupted.
>
> He's got a very valid point, and the responses are disheartening at
> best. Just because other file systems don't detect the corruption, or
> require lots of work to recover, does not make it OK for zfs to do the
> same. Excuses are just that, excuses. He isn't asking for an excuse,
> he's asking for an answer.

Excuses? I did sense an issue with terminology and messaging, but there are no excuses here. ZFS detected a problem. The problem did not affect his data, as it was recovered.

I'd like to reiterate here that if you can think of a better way to communicate with people, then please file a bug. Changes in messages and docs tend to be much easier than changes in logic.

P.S. don't shoot the canary!
 -- richard
Florian Ermisch
2009-Apr-16 21:27 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Uwe Dippel schrieb:
> No doubt. But ext3 also seems to need much less attention, very much
> fewer commands. Which leaves it as a viable alternative. I still hope
> that one day ZFS will be maintainable as simply as ext3; respectively
> do all that maintenance on its own. :)

Ext3 has no (optional) redundancy across more than one disc, and no volume management. You need the Device Mapper for redundancy (Multiple Devices, or the Linux Volume Management) and LVM again for volume management. If you want such features on Linux, ext3 sits on top of at least 2, probably 3, layers of storage management; see the sketch below. Should I add NFS, CIFS and iSCSI exports, or the needlessness of resizing volumes?

You're comparing a single tool with a whole production line. Sorry for the flaming, but yesterday I spent 4 additional hours at work recovering a Xen server where a single error somewhere in its LVM caused the virtual servers to freeze.

Kind Regards, Florian
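To make the layering concrete, here is a sketch of what a mirrored filesystem costs on each stack. The device names are hypothetical, and the Linux commands are illustrative of the era rather than a recipe:

# Linux circa 2009: three layers - md for the mirror, LVM for volumes, ext3 on top
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
pvcreate /dev/md0
vgcreate vg0 /dev/md0
lvcreate -L 100G -n data vg0
mkfs.ext3 /dev/vg0/data

# ZFS: pool and filesystem in one tool
zpool create tank mirror c1d0 c2d0
zfs create tank/data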
Drew Balfour
2009-Apr-16 22:15 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
>>> Now I wonder where that error came from. It was just a single
>>> checksum error. It couldn't go away with an earlier scrub, and
>>> seemingly left no traces of badness on the drive. Something serious?
>>> At least it looks a tad contradictory: "Applications are
>>> unaffected.", it is unrecoverable, and once cleared, there is no
>>> error left.

What happens if you rescrub the pool after clearing the errors? If zfs has reused whatever was causing the issue, then it shouldn't be surprising if the error shows up again.

> Could you propose alternate wording?

My $.02, but the wording in the error message is rather obtuse. "Unrecoverable error" indicates to me that something was lost; technically this is true, but zfs was able to replicate the data from another source. This is not at all clear from the message:

status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.

This doesn't indicate whether the attempt was successful or not. We all know it was, because if it wasn't, we'd (a) see another error instead and/or (b) see something other than "errors: No known data errors". But, unless you know zfs well enough to make that leap, you're left wondering what actually happened.

Granted, the 'verbose' error page (http://www.sun.com/msg/ZFS-8000-9P) does a much better job of explaining. However, confusing terse error messages are never good, and asking the user to go look stuff up in order to understand isn't good either. Also, the verbose error page doesn't explain that, despite there not being a replicated configuration, metadata is replicated, and so errors can be recovered from a seemingly 'unrecoverable' state.

Does anyone know why it's "applications" and not "data"?

Perhaps something like:

status: One or more devices has experienced an error. A successful attempt to
        correct the error was made using a replicated copy of the data.
        Data on the pool is unaffected.

-Drew
Toby Thain
2009-Apr-16 23:13 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On 16-Apr-09, at 5:27 PM, Florian Ermisch wrote:
> Ext3 has no (optional) redundancy across more than one disc, and no
> volume management. You need the Device Mapper for redundancy (Multiple
> Devices, or the Linux Volume Management) and LVM again for volume
> management.

And you'll still be lacking checksumming and self-healing.

--Toby

> If you want such features on Linux, ext3 sits on top of at least 2,
> probably 3, layers of storage management.
> Should I add NFS, CIFS and iSCSI exports, or the needlessness of
> resizing volumes?
>
> You're comparing a single tool with a whole production line.
> Sorry for the flaming, but yesterday I spent 4 additional hours at work
> recovering a Xen server where a single error somewhere in its LVM
> caused the virtual servers to freeze.
Uwe Dippel
2009-Apr-16 23:39 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Drew Balfour wrote:
> Does anyone know why it's "applications" and not "data"?
>
> Perhaps something like:
>
> status: One or more devices has experienced an error. A successful attempt to
>         correct the error was made using a replicated copy of the data.
>         Data on the pool is unaffected.

If it was (successful), that would have been something. It wasn't. 'status' brought up the 'unrecoverable error', whatever number of 'scrub's I did. Toby: 'self-healing' is fine, but that message simply sounds scary, and worse: it proposes no further course of action, nor its consequences.

"Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'."

This does sound scary, at least to me. How do I 'determine if the device needs to be replaced'? Should I 'clear' or 'replace'? In the end, it needed a 'clear', and that one CKSUM error went away - as it seems, without further consequences and with a fully sane disk. Don't call that 'self-healing'. This is an arcane method demanding plenty of user activity, interaction, reading-up, etc.

Yes, Richard, you are correct, linguistically. There was an unrecoverable error in a layer not affecting the layer containing the data. Telling ZFS to replace some metadata with correct metadata resolved the - probably - non-existent problem. This reminds me of vfat, with its mirror FAT. Would I want to read about an 'unrecoverable error' when the mirror is needed? Probably not. And even then, I wouldn't want to have to type 'clear'. And surely I wouldn't want to wait until I typed 'status' before being made aware of the existence of an unrecoverable error, would I!

It seems most in here don't run production servers. A term like 'unrecoverable' sends me into a state of frenzy. It sounds like my systems are dying any minute. From what I read, it is harmless: some redundant metadata could not be retrieved. If this was the case, Toby, I wouldn't want to have to type anything. I'd rather have the system detect the situation of its own accord, try the redundant metadata (we do have snapshots, don't we!), and scrub on its very own. At the end, a mail to root would be in order, informing me that an error has been corrected and no data compromised at all. Thank you, ZFS! That's what I'd call 'self-healing' and 21st century.

Uwe
Bob Friesenhahn
2009-Apr-17 00:38 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
On Fri, 17 Apr 2009, Uwe Dippel wrote:
> It seems most in here don't run production servers. A term like
> 'unrecoverable' sends me into a state of frenzy. It sounds like my
> systems are dying any minute. From what I read, it is harmless: some
> redundant

While your system is still running and user data has not been compromised, the issue is not necessarily harmless, since your hard drive may be on a path to failure. Continuing data loss usually indicates a failing hard drive.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Robert Milkowski
2009-Apr-17 00:54 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Hello Uwe,

Thursday, April 16, 2009, 10:38:00 AM, you wrote:

UD> What this thread actually shows, alas, is that ZFS is rocket science.
UD> In 2009, one would expect a file system to 'just work'. Why would
UD> anyone want to have to 'status' it regularly, in case 'scrub' it, and
UD> [...]
UD> is a dead-born project; with all due respect that I as an engineer of

With all due respect, you don't understand how zfs works. With your ext3, or whatever you use on OpenBSD, if your system ends up with some corrupt data being returned from one of the disks in a mirror, you will get:

- some of your data silently corrupted, and/or
- a file system that requires fsck, which won't fix user data if it is affected, and/or
- an OS panic, and/or
- the loss of some or all of the data in a file system

With zfs in such a case, everything will work fine: all applications will get the *PROPER* data, and the corrupted block will be automatically fixed. That's what happened to you. You don't have to do anything, and it will just work.

Now, zfs not only returned proper data to your applications and fixed a corrupted block, it also reported this to you via the zpool status output. You can do 'zpool clear' in order to acknowledge that the above has happened, or you can leave it as it is; other than being a record of the above case, it doesn't require you to do anything.

In summary - if you want to put it live and forget about it entirely, fine: do so, and it will work as expected; cases of bad data being returned from one disk in a mirror will be automatically fixed, with proper data returned. On your OpenBSD, by contrast, there would be serious consequences if one of the disks returned bad data.

I don't understand why you're complaining about zfs reporting to you that you might have an issue - you do not need to read the report or do anything if you don't want to; or, if you really value your data, you can investigate what's going on before it is too late, while in the meantime zfs provides your applications with correct data.

--
Best regards,
Robert Milkowski
http://milek.blogspot.com
Robert Milkowski
2009-Apr-17 00:58 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Hello Richard,

Thursday, April 16, 2009, 8:41:53 PM, you wrote:

RE> Excuses? I did sense an issue with terminology and messaging, but
RE> there are no excuses here. ZFS detected a problem. The problem did
RE> not affect his data, as it was recovered.
RE> [...]
RE> P.S. don't shoot the canary!

I suspect that Uwe thought that unless he did a 'zpool clear' there was something wrong, and that clearing was required. Actually, no - the counter is only information that corruption happened, but that, thanks to redundancy and checksums, applications got *correct* data and the corrupted data was fixed. zpool clear only "resets" the statistics of such errors, nothing more, and one doesn't even have to bother checking them if one doesn't care about being pro-active against possible future failure.

--
Best regards,
Robert Milkowski
http://milek.blogspot.com
Robert Milkowski
2009-Apr-17 01:08 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Hello Blake,

Wednesday, April 15, 2009, 5:18:19 PM, you wrote:

B> You only need to decide what you want here. Yes, ext3 requires less
B> maintenance, because it can't tell you if a block becomes corrupt
B> (though fsck'ing when that *does* happen can take hours, compared to
B> zfs replacing with a good block from the other half of your mirror).

I can't agree that ext3 requires less maintenance; actually, it is quite the opposite. If everything is fine and there is no data corruption, then you don't have to do anything on either file system. But when corruption happens on one side of a mirror, you still don't have to do anything in the zfs case: the data returned to applications will still be correct, while the corrupted data on disk will be automatically repaired. Now, if you really value your data, you probably want to monitor whether such zfs-correctable events happen and investigate further to prevent an eventual failure - but you don't have to.

In the ext3 case, if one side of a mirror returns corrupted data, you will end up with applications getting BAD data, and/or will have to fsck the filesystem, and/or will lose some or all data, and/or the OS will panic, etc.

Then, if you do want to investigate: on the OpenSolaris platform, thanks to zfs, FMA and other tools, you've actually got some chance to nail down the underlying issue, while on Linux with ext3 you end up blaming unidentified bugs in your file system (well, one might argue that the lack of data consistency checking and repair in a fs is itself a bug...); at the least, your toolset for finding out what's going on is limited compared to what OpenSolaris has to offer.

--
Best regards,
Robert Milkowski
http://milek.blogspot.com
Robert Milkowski
2009-Apr-17 01:15 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Hello Uwe,

Friday, April 17, 2009, 12:39:13 AM, you wrote:

UD> Drew Balfour wrote:
>> Perhaps something like:
>>
>> status: One or more devices has experienced an error. A successful attempt to
>>         correct the error was made using a replicated copy of the data.
>>         Data on the pool is unaffected.

UD> If it was (successful), that would have been something. It wasn't.
UD> 'status' brought up the 'unrecoverable error', whatever number of
UD> 'scrub's I did.

And it *was* successful - it did recover. When you read the message carefully, you will see that it says "Applications are unaffected" and that you don't have to do anything. You can investigate if you want to, but you don't have to.

Now, zpool scrub will read all used data and verify it against the checksums, correct it if required, and report new error statistics if needed. It won't clear the error statistics. If you want to clear them, use 'zpool clear', as you did.

--
Best regards,
Robert Milkowski
http://milek.blogspot.com
Drew Balfour
2009-Apr-17 02:57 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Uwe Dippel wrote:
> If it was (successful), that would have been something. It wasn't.

It was; zfs successfully repaired the data, as is evidenced by the lack of errors in the status output:

errors: No known data errors

> 'status' brought up the 'unrecoverable error', whatever number of
> 'scrub's I did.

Hence the misunderstanding. The scrub is telling you, rather confusingly, that the device has an error, but that zfs has managed to work around this error and maintain data integrity. The scrub will not 'fix' the error, as zfs can't fix, say, a bad block on your disk drive. It will, however, maintain data integrity if possible. See below for an example of what I'm trying to convey.

> "Determine if the device needs to be replaced, and clear the errors
> using 'zpool clear' or replace the device with 'zpool replace'."
> This does sound scary, at least to me. How do I 'determine if the device
> needs to be replaced'? Should I 'clear' or 'replace'?

It depends on what caused the error. For example, if I have a mirrored pool and accidentally overwrite one side of the mirror, zpool status will show you the errors and leave it up to you:

# zpool create swim mirror c4t1d0s0 c4t1d0s1
# zpool status
  pool: swim
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        swim          ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c4t1d0s0  ONLINE       0     0     0
            c4t1d0s1  ONLINE       0     0     0

errors: No known data errors

# dd if=/dev/zero of=/dev/dsk/c4t1d0s0 bs=1024x1024 skip=5 count=50
50+0 records in
50+0 records out

!!oh no, I just zero'd out half of one of my mirror devices!!

# zpool scrub swim
# zpool status
  pool: swim
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h0m with 0 errors on Thu Apr 16 18:52:28 2009
config:

        NAME          STATE     READ WRITE CKSUM
        swim          DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c4t1d0s0  DEGRADED     0     0    87  too many errors
            c4t1d0s1  ONLINE       0     0     0

errors: No known data errors

!!Since I didn't actually have any data on the pool, the only errors were
!!metadata checksum errors.

The confusion here is that in the above output, "error" has different meanings depending on its context.

"One or more devices has experienced an unrecoverable error."

In this context, "error" refers to zfs reading data off the disk and finding that the checksum doesn't match (or, in this case, exist at all). zfs has no idea why the checksum doesn't match; it could be a drive error, a driver error, a user-caused error, bad bits on the bus, whatever. zfs cannot correct these errors, any more than any software can fix any hardware error. We do know that whatever the error was, we didn't get an associated I/O error from the drive, as that column is zero. So the drive doesn't even know there's an error!

"An attempt was made to correct the error."

In this context, "error" refers to the actual bad checksum. zfs can fix this: in this case, by reading either from the other side of the mirror or from the replicated metadata. It should be noted that this attempt was successful, as zfs was able to maintain data integrity. This is implied in the message, confusingly.
"scrub: scrub completed after 0h0m with 0 errors on Thu Apr 16 18:52:28 2009" "errors: No known data errors" In this context, "error" refers to uncorrectable, unrecoverable data corruption. There is a problem with your data, and zfs was unable to fix it. In this case, there were none of these, which is a good thing. Now, as to whether to replace or clear... In this particular case, I know what caused the error. Me. I know the disk is fine. I can simply: # zpool clear swim # zpool status pool: swim state: ONLINE scrub: scrub completed after 0h0m with 0 errors on Thu Apr 16 18:52:28 2009 config: NAME STATE READ WRITE CKSUM swim ONLINE 0 0 0 mirror ONLINE 0 0 0 c4t1d0s0 ONLINE 0 0 0 c4t1d0s1 ONLINE 0 0 0 errors: No known data errors zfs clear simply zeros the device error counters. I know there was nothing wrong with the device, so I can forget about those errors. If I didn''t know the cause of the error, and suspected a bad disk, I''d probably choose to replace the device.> In the end, it needed a ''clear'' and that one CKSUM error went away. As > it seems without further consequences and a fully sane disk. > Don''t call that ''self-healing''. This is an arcane method demanding > plenty of user activity, interaction, reading-up, etc.zfs clear will _always_ clear _all_ errors. It''s a sysadmin''s choice to clear the error counters. You don''t have to clear the errors; if you''d rather keep track of all of the errors over the lifetime of the pool, go right ahead. # zpool status | egrep "errors: |c4t1d0s0" c4t1d0s0 ONLINE 0 0 0 errors: No known data errors # dd if=/dev/zero of=/dev/dsk/c4t1d0s0 bs=1024x1024 skip=5 count=50 # zpool scrub swim # zpool status | egrep "errors: |c4t1d0s0" c4t1d0s0 DEGRADED 0 0 652 too many errors errors: No known data errors # dd if=/dev/zero of=/dev/dsk/c4t1d0s0 bs=1024x1024 skip=5 count=50 # zpool scrub swim # zpool status | egrep "errors: |c4t1d0s0" c4t1d0s0 DEGRADED 0 0 1.27K too many errors errors: No known data errors You can zpool clear at any time, or you can never do it. Of course, if you don''t know the cause of the errors, clearing probably isn''t the best course of action, if you value your data. Replacing the device will also reset the counters, obviously, as the old device is removed and the new device (hopefully) has no problems: # zpool status| grep c4t1d0s0 c4t1d0s0 DEGRADED 0 0 84 too many errors # zpool replace swim c4t1d0s0 c4t1d0s3 # zpool status | grep c4t1d0s3 c4t1d0s3 ONLINE 0 0 0 83.5K resilvered> It seems most in here don''t run production servers. A term like > ''unrecoverable'' sends me into a state of frenzy.Personally, I agree. I think the wording of the current message is confusing at best, and panic inducing at worst.> If this was the case, Toby, I wouldn''t want to have to type anything. I''d rather > have the system > detecting the situation on its own accord, trying the redundant metadata > (we do have snapshots, don''t we!), and scrub on its very own. At the > end, a mail to root would be in order, informing me that an error has > been corrected and no data compromised at all.That''s actually exactly what happened, minus the email. In your case, and in all the examples above, the "zpool scrub" is entirely unnecessary. I ran it in the examples to force zfs to examine the pool and find the errors. If I''d left it alone, and done things to the file system, it would have found the errors and dealt with them as the data was accessed. 
In other words, I could have done:

!!put some data on the pool:
# dd if=/dev/urandom of=/swim/a bs=1024x1024 count=60
60+0 records in
60+0 records out

!!do something foolish
# dd if=/dev/zero of=/dev/dsk/c4t1d0s0 bs=1024x1024 skip=5 count=50
50+0 records in
50+0 records out

!!use the data on the pool
# dd if=/swim/a of=/b bs=1024x1024
60+0 records in
60+0 records out

# zpool status
  pool: swim
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        swim          ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c4t1d0s0  ONLINE       0     0    14
            c4t1d0s1  ONLINE       0     0     0

errors: No known data errors

Now, if you'd like zfs to email you when it finds errors, that's easy enough to do, since zfs helpfully logs failures with the fma daemon. By default that dumps to /var/adm/messages, but sending an email to root, or paging you, would be trivial to implement:

Apr 16 19:28:53 pcandle3 fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-GH, TYPE: Fault, VER: 1, SEVERITY: Major
Apr 16 19:28:53 pcandle3 EVENT-TIME: Thu Apr 16 19:28:53 PDT 2009
Apr 16 19:28:53 pcandle3 PLATFORM: Sun Fire X4200 M2, CSN: 0718BD03B4 , HOSTNAME: pcandle3
Apr 16 19:28:53 pcandle3 SOURCE: zfs-diagnosis, REV: 1.0
Apr 16 19:28:53 pcandle3 EVENT-ID: cd6fe5bc-9137-c32a-c811-ba98dac5dbe9
Apr 16 19:28:53 pcandle3 DESC: The number of checksum errors associated with a ZFS device
Apr 16 19:28:53 pcandle3 exceeded acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-GH for more information.
Apr 16 19:28:53 pcandle3 AUTO-RESPONSE: The device has been marked as degraded.  An attempt
Apr 16 19:28:53 pcandle3 will be made to activate a hot spare if available.
Apr 16 19:28:53 pcandle3 IMPACT: Fault tolerance of the pool may be compromised.
Apr 16 19:28:53 pcandle3 REC-ACTION: Run 'zpool status -x' and replace the bad device.

However, I think we can all agree that _not_ telling you that there were problems is not a good idea. I think the argument against automatically scrubbing the entire pool is that scrubs are very I/O intensive, and that could negatively impact performance. Assuming the pool is redundantly configured, there's no danger of losing data, and any bad data or checksums will be corrected on the fly. Of course, if it were my system and I got random, unexplained checksum errors, I'd probably scrub the pool, performance be damned.

-Drew
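For anyone who wants that email without watching syslog by hand, a minimal cron-job sketch would do. This is an illustration only - the paths, schedule and mailx invocation are assumptions, not a supported tool:

#!/bin/sh
# Hypothetical watchdog: mail root whenever any pool is unhealthy.
# 'zpool status -x' prints "all pools are healthy" when there is
# nothing to report, so any other output is worth a message.
STATUS=`/usr/sbin/zpool status -x`
if [ "$STATUS" != "all pools are healthy" ]; then
        echo "$STATUS" | /usr/bin/mailx -s "zpool alert on `hostname`" root
fi

Run it from root's crontab, e.g. every 15 minutes:

0,15,30,45 * * * * /usr/local/bin/zpool-watch.sh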
Richard Elling
2009-Apr-17 17:25 UTC
[zfs-discuss] How recoverable is an 'unrecoverable error'?
Drew Balfour wrote:
> What happens if you rescrub the pool after clearing the errors? If zfs
> has reused whatever was causing the issue, then it shouldn't be
> surprising if the error shows up again.

Are you assuming that bad disk blocks are returned to the free pool? This is more of a problem for file systems with pre-allocated metadata, such as UFS. In UFS, if a sector in a superblock copy goes bad, it will still be reused. In ZFS, metadata is COW and redundant, so there is no forced re-use of disk blocks (except for the uberblocks, which are 4x redundant and use 128-slot circular queues).

> Perhaps something like:
>
> status: One or more devices has experienced an error. A successful attempt to
>         correct the error was made using a replicated copy of the data.
>         Data on the pool is unaffected.

I think this is on the right track. But the repair method, "replicated copy of the data," should be more vague, because there are other ways to repair data.

Does anyone else have better wording?
 -- richard
On Fri, Apr 17, 2009 at 12:25 PM, Richard Elling
<richard.elling at gmail.com> wrote:
> Drew Balfour wrote:
>> Perhaps something like:
>>
>> status: One or more devices has experienced an error. A successful
>>         attempt to correct the error was made using a replicated copy
>>         of the data. Data on the pool is unaffected.
>
> I think this is on the right track. But the repair method, "replicated
> copy of the data," should be more vague, because there are other ways
> to repair data.
>
> Does anyone else have better wording?

Unless you want to have a different response for each of the repair
methods, I''d just drop that part:

status: One or more devices has experienced an error. The error has been
        automatically corrected by zfs. Data on the pool is unaffected.

I suppose you could add a "for more information please contact Sun" or
something along those lines as well?

--Tim (my reply-to-all skills have been suffering lately, sorry Richard.)
Carson Gaspar
2009-Apr-17 17:45 UTC
[zfs-discuss] How recoverable is an ''unrecoverable error''?
Tim wrote (although it wasn''t his error originally):

> Unless you want to have a different response for each of the repair
> methods, I''d just drop that part:
>
> status: One or more devices has experienced an error. The error has been
>         automatically corrected by zfs. Data on the pool is unaffected.

"Data on the pool are unaffected." Data is plural.

Aside from the grammar police work, I also agree that this is a better
error message to present to the user. (If we''re going to change it, I''d
appreciate the new version being one that doesn''t have a detrimental
effect on my dental work due to teeth grinding... ;-)

--
Carson
Drew Balfour
2009-Apr-17 17:52 UTC
[zfs-discuss] How recoverable is an ''unrecoverable error''?
> Are you assuming that bad disk blocks are returned to the free pool?

Hrm. I was assuming that zfs was unaware of the source of the error, and
therefore unable to avoid running into it again. If it was a bad sector,
and the disk knows about it, then you probably wouldn''t see it again.
But if the disk thinks the sector is good, but it''s flipping bits, will
zfs prevent the disk from reusing that sector?

-Drew
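P.S. For what it''s worth, you can at least ask the drive whether it has
quietly remapped anything. If you have smartmontools installed (it isn''t
part of a stock install, so consider this a sketch; the device path is
just the one from my earlier example):

# smartctl -A /dev/rdsk/c4t1d0s0

Reallocated_Sector_Ct shows how many sectors the drive has already
remapped, and a nonzero Current_Pending_Sector means the disk itself
suspects a sector but hasn''t remapped it yet.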
Carson Gaspar wrote:
> Tim wrote (although it wasn''t his error originally):
>
>> Unless you want to have a different response for each of the repair
>> methods, I''d just drop that part:
>>
>> status: One or more devices has experienced an error. The error has been
>>         automatically corrected by zfs. Data on the pool is unaffected.
>
> "Data on the pool are unaffected." Data is plural.

Not to nitpick, but I think most people would prefer the singular ''data''
when referring to the storage of data. The plural ''data'' in this case is
very awkward.
Bob Friesenhahn
2009-Apr-17 18:17 UTC
[zfs-discuss] How recoverable is an ''unrecoverable error''?
On Fri, 17 Apr 2009, Dave wrote:
>
> Not to nitpick, but I think most people would prefer the singular
> ''data'' when referring to the storage of data. The plural ''data'' in
> this case is very awkward.

Assuming that what is stored can be classified as data!

http://en.wikipedia.org/wiki/Data

Why do we call these collections of bits "data"?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Fri, Apr 17, 2009 at 1:17 PM, Bob Friesenhahn
<bfriesen at simple.dallas.tx.us> wrote:
> On Fri, 17 Apr 2009, Dave wrote:
>>
>> Not to nitpick, but I think most people would prefer the singular
>> ''data'' when referring to the storage of data. The plural ''data'' in
>> this case is very awkward.
>
> Assuming that what is stored can be classified as data!
>
> http://en.wikipedia.org/wiki/Data
>
> Why do we call these collections of bits "data"?

Because the CxO wouldn''t have a frigging clue what you were talking
about if you started referencing collections of bits? "We need to buy
this $50,000 storage array to store our collection of bits" would likely
get you escorted out of his/her office.

--Tim