Hello List,

I recently got bitten by a "panic on `zpool import`" problem (same CR
6915314) while testing a ZFS file server. Seems the pool is pretty much
gone; I did try
- zfs:zfs_recover=1 and aok=1 in /etc/system
- `zpool import -fF -o ro`
to no avail. I don't think I will be taking the time trying to fix it
unless someone has good ideas. I suspect bad data was written to the pool
and it seems there is no way to recover; fmdump shows a problem with the
same block on all disks, IIRC.

The server hardware is pretty ghetto, with whitebox components such as
non-ECC RAM (the cause of the pool loss). I know the hardware sucks, but
sometimes non-technical people don't understand the value of data before
it is lost. I was lucky the system had not been sent out yet and the
project was "simply" delayed.

In light of this experience, I would say raidz is not useful in certain
hardware failure scenarios. One bad bit in RAM at the wrong time and the
whole pool is lost.

Does the list have any ideas on how to make this kind of ghetto system
more resilient (short of buying ECC RAM and a motherboard for it)?

I was thinking something like this:
- pool1: raidz pool for the bulk data
- pool2: mirror pool for backing up the raidz pool, only imported while
  copying pool1 to pool2

What would be the most reliable way to copy the data from pool1 to pool2,
keeping in mind "bad bit in RAM and everything is lost"? I worry most
about corrupting pool2 as well if pool1 has gone bad or there is a similar
hardware failure again. Or is this whole idea just added complexity with
no real benefit?


Regards,

Ville
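P.S. To make the pool1 -> pool2 idea concrete, roughly what I had in mind
(just a sketch; the snapshot label is a placeholder):

  zpool import pool2                 # attach the backup pool only while copying
  zfs snapshot -r pool1@backup-1     # consistent point-in-time source
  zfs send -R pool1@backup-1 | zfs receive -Fdu pool2
  zpool export pool2                 # keep it offline the rest of the time

Later runs could ship an incremental stream instead, e.g.
`zfs send -R -i pool1@backup-1 pool1@backup-2 | zfs receive -Fdu pool2`.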
Edward Ned Harvey
2010-Nov-15 15:21 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of VO
>
> The server hardware is pretty ghetto, with whitebox components such as
> non-ECC RAM (the cause of the pool loss). I know the hardware sucks, but
> sometimes non-technical people don't understand the value of data before
> it is lost. I was lucky the system had not been sent out yet and the
> project was "simply" delayed.
>
> In light of this experience, I would say raidz is not useful in certain
> hardware failure scenarios. One bad bit in RAM at the wrong time and the
> whole pool is lost.
>
> Does the list have any ideas on how to make this kind of ghetto system
> more resilient (short of buying ECC RAM and a motherboard for it)?

Backups.

Even if you upgrade your hardware to better stuff... with ECC and so on ...
there is no substitute for backups.  Period.  If you care about your data,
you will do backups.  Period.
Bryan Horstmann-Allen
2010-Nov-15 15:32 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
+------------------------------------------------------------------------------
| On 2010-11-15 10:21:06, Edward Ned Harvey wrote:
|
| Backups.
|
| Even if you upgrade your hardware to better stuff... with ECC and so on ...
| there is no substitute for backups.  Period.  If you care about your data,
| you will do backups.  Period.

Backups are not going to save you from bad memory writing corrupted data
to disk.

If your RAM flips a bit and writes garbage to disk, and you back up that
garbage, guess what: your backups are full of garbage.

Invest in ECC RAM and hardware that is, at the least, less likely to screw
you.

Test your backups to ensure you can trust them.
-- 
bdha
cyberpunk is dead. long live cyberpunk.
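P.S. One cheap spot check, assuming both pools are imported, with a
hypothetical path standing in for real data (digest(1) is the native
Solaris checksum tool):

  # the two sums should match between source and backup
  digest -a sha256 /pool1/projects/data.tar /pool2/projects/data.tar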
Chad Leigh -- Shire.Net LLC
2010-Nov-15 16:25 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
On Nov 15, 2010, at 8:32 AM, Bryan Horstmann-Allen wrote:

> +------------------------------------------------------------------------------
> | On 2010-11-15 10:21:06, Edward Ned Harvey wrote:
> |
> | Backups.
> |
> | Even if you upgrade your hardware to better stuff... with ECC and so on ...
> | there is no substitute for backups.  Period.  If you care about your data,
> | you will do backups.  Period.
>
> Backups are not going to save you from bad memory writing corrupted data
> to disk.
>
> If your RAM flips a bit and writes garbage to disk, and you back up that
> garbage, guess what: your backups are full of garbage.
>
> Invest in ECC RAM and hardware that is, at the least, less likely to
> screw you.
>
> Test your backups to ensure you can trust them.

The amount of resources someone invests in trying to fix this is probably
more than the cost of some ECC RAM and a motherboard.
Toby Thain
2010-Nov-15 16:27 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
On 15/11/10 10:32 AM, Bryan Horstmann-Allen wrote:
> +------------------------------------------------------------------------------
> | On 2010-11-15 10:21:06, Edward Ned Harvey wrote:
> |
> | Backups.
> |
> | Even if you upgrade your hardware to better stuff... with ECC and so on ...
> | there is no substitute for backups.  Period.  If you care about your data,
> | you will do backups.  Period.
>
> Backups are not going to save you from bad memory writing corrupted data
> to disk.

It is, however, a major motive for using ZFS in the first place.

--Toby

> If your RAM flips a bit and writes garbage to disk, and you back up that
> garbage, guess what: your backups are full of garbage.
>
> Invest in ECC RAM and hardware that is, at the least, less likely to
> screw you.
>
> Test your backups to ensure you can trust them.
Sigbjorn Lie
2010-Nov-15 23:08 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
Do you need registered ECC, or will non-registered ECC do, to get around
this issue you described?


On Mon, 2010-11-15 at 16:48 +0700, VO wrote:
> Hello List,
>
> I recently got bitten by a "panic on `zpool import`" problem (same CR
> 6915314) while testing a ZFS file server. Seems the pool is pretty much
> gone; I did try
> - zfs:zfs_recover=1 and aok=1 in /etc/system
> - `zpool import -fF -o ro`
> to no avail. I don't think I will be taking the time trying to fix it
> unless someone has good ideas. I suspect bad data was written to the pool
> and it seems there is no way to recover; fmdump shows a problem with the
> same block on all disks, IIRC.
>
> The server hardware is pretty ghetto, with whitebox components such as
> non-ECC RAM (the cause of the pool loss). I know the hardware sucks, but
> sometimes non-technical people don't understand the value of data before
> it is lost. I was lucky the system had not been sent out yet and the
> project was "simply" delayed.
>
> In light of this experience, I would say raidz is not useful in certain
> hardware failure scenarios. One bad bit in RAM at the wrong time and the
> whole pool is lost.
>
> Does the list have any ideas on how to make this kind of ghetto system
> more resilient (short of buying ECC RAM and a motherboard for it)?
>
> I was thinking something like this:
> - pool1: raidz pool for the bulk data
> - pool2: mirror pool for backing up the raidz pool, only imported while
>   copying pool1 to pool2
>
> What would be the most reliable way to copy the data from pool1 to pool2,
> keeping in mind "bad bit in RAM and everything is lost"? I worry most
> about corrupting pool2 as well if pool1 has gone bad or there is a
> similar hardware failure again. Or is this whole idea just added
> complexity with no real benefit?
>
>
> Regards,
>
> Ville
Bryan Horstmann-Allen
2010-Nov-16 00:54 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
+------------------------------------------------------------------------------
| On 2010-11-15 11:27:02, Toby Thain wrote:
|
| > Backups are not going to save you from bad memory writing corrupted
| > data to disk.
|
| It is, however, a major motive for using ZFS in the first place.

In this context, not trusting your disks is the motive. If corruption
(even against metadata) happens in-memory, ZFS will happily write it to
disk. Has this behavior changed in the last 6 months?
-- 
bdha
cyberpunk is dead. long live cyberpunk.
Toby Thain
2010-Nov-16 02:16 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
On 15/11/10 7:54 PM, Bryan Horstmann-Allen wrote:
> +------------------------------------------------------------------------------
> | On 2010-11-15 11:27:02, Toby Thain wrote:
> |
> | > Backups are not going to save you from bad memory writing corrupted
> | > data to disk.
> |
> | It is, however, a major motive for using ZFS in the first place.
>
> In this context, not trusting your disks is the motive. If corruption
> (even against metadata) happens in-memory, ZFS will happily write it to
> disk. Has this behavior changed in the last 6 months?

The corruption will at least be detected by a scrub, even in cases where
it cannot be repaired.

--Toby
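P.S. For instance, with "tank" standing in for the pool name:

  zpool scrub tank       # re-read every allocated block and verify checksums
  zpool status -v tank   # once complete: error counts and any damaged files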
Edward Ned Harvey
2010-Nov-16 02:28 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Toby Thain
>
> The corruption will at least be detected by a scrub, even in cases where
> it cannot be repaired.

Not necessarily.  Let's suppose you have some bad memory, and no ECC.  Your
application does 1 + 1 = 3.  Then your application writes the answer to a
file.  Without ECC, the corruption happened in memory and went undetected.
Then the corruption was written to the file, with a correct checksum.  So
in fact it's not filesystem corruption, and ZFS will correctly mark the
filesystem as clean and free of checksum errors.

In conclusion:

Use ECC if you care about your data.
Do backups if you care about your data.

Don't be a cheapskate, or else, don't complain when you get bitten by lack
of adequate data protection.
Toby Thain
2010-Nov-16 02:54 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
On 15/11/10 9:28 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Toby Thain
>>
>> The corruption will at least be detected by a scrub, even in cases where
>> it cannot be repaired.
>
> Not necessarily.  Let's suppose you have some bad memory, and no ECC.
> Your application does 1 + 1 = 3.  Then your application writes the answer
> to a file.  Without ECC, the corruption happened in memory and went
> undetected.  Then the corruption was written to the file, with a correct
> checksum.  So in fact it's not filesystem corruption, and ZFS will
> correctly mark the filesystem as clean and free of checksum errors.

I meant corruption after the point at which the application passes its
buffer to zfs. But you are right, the checksum could conceivably be
correct in this case as well.

> In conclusion:
>
> Use ECC if you care about your data.
> Do backups if you care about your data.

Yes. Especially the latter :)

--Toby

> Don't be a cheapskate, or else, don't complain when you get bitten by
> lack of adequate data protection.
Sriram Narayanan
2010-Nov-16 03:09 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
To add: even if you have great faith in ZFS, a backup helps in dealing
with the unknown.

Consider:
- multiple disk failures that you are somehow unable to respond to.
- hardware failures (power supplies, motherboard, RAM).
- damage to the building.
- having to recreate everything elsewhere - even on another system - for a
  special reason.

ECC RAM will help ensure that the data given to ZFS is error free. ZFS
will ensure that it's able to detect errors while writing to the storage
medium.

There are still issues such as disks reporting that data has been written
when they have not actually written it yet. Could someone elaborate a bit
more on this aspect, please?

-- Sriram

On 11/16/10, Edward Ned Harvey <shill at nedharvey.com> wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Toby Thain
>>
>> The corruption will at least be detected by a scrub, even in cases where
>> it cannot be repaired.
>
> Not necessarily.  Let's suppose you have some bad memory, and no ECC.
> Your application does 1 + 1 = 3.  Then your application writes the answer
> to a file.  Without ECC, the corruption happened in memory and went
> undetected.  Then the corruption was written to the file, with a correct
> checksum.  So in fact it's not filesystem corruption, and ZFS will
> correctly mark the filesystem as clean and free of checksum errors.
>
> In conclusion:
>
> Use ECC if you care about your data.
> Do backups if you care about your data.
>
> Don't be a cheapskate, or else, don't complain when you get bitten by
> lack of adequate data protection.

-- 
Sent from my mobile device

=================
Belenix: www.belenix.org
> On Nov 15, 2010, at 8:32 AM, Bryan Horstmann-Allen wrote:
>
> > +------------------------------------------------------------------------------
> > | On 2010-11-15 10:21:06, Edward Ned Harvey wrote:
> > |
> > | Backups.
> > |
> > | Even if you upgrade your hardware to better stuff... with ECC and
> > | so on ... there is no substitute for backups.  Period.  If you care
> > | about your data, you will do backups.  Period.
> >
> > Backups are not going to save you from bad memory writing corrupted
> > data to disk.
> >
> > If your RAM flips a bit and writes garbage to disk, and you back up
> > that garbage, guess what: your backups are full of garbage.
> >
> > Invest in ECC RAM and hardware that is, at the least, less likely to
> > screw you.
> >
> > Test your backups to ensure you can trust them.
>
> The amount of resources someone invests in trying to fix this is
> probably more than the cost of some ECC RAM and a motherboard.

This is very true in the developed world. In the "developing world",
investing in ECC RAM and a motherboard to go with it is equivalent to the
monthly salary of 2-8 local staff, and many times even more. As a
consultant with a significantly higher cost of labour, it is still usually
more cost-effective for me to throw money at the work instead of the
hardware.

Regards,

Ville
michael.p.sullivan at mac.com
2010-Nov-16 10:21 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
Ummm... there's a difference between data integrity and data corruption.

Integrity is enforced programmatically by something like a DBMS. This sets
up basic rules that ensure the programmer, program, or algorithm adheres
to a level of sanity and bounds.

Corruption is where cosmic rays, bit rot, malware, or some other agent
writes at the block level. ZFS protects systems from a lot of this by the
way it's constructed to keep metadata, checksums, and duplicates of
critical data.

If the filesystem is given bad data, it will faithfully lay it down on
disk. If that faulty data later gets corrupted, ZFS will come in and save
the day.

Regards,

Mike

On Nov 16, 2010, at 11:28, Edward Ned Harvey <shill at nedharvey.com> wrote:

>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Toby Thain
>>
>> The corruption will at least be detected by a scrub, even in cases where
>> it cannot be repaired.
>
> Not necessarily.  Let's suppose you have some bad memory, and no ECC.
> Your application does 1 + 1 = 3.  Then your application writes the answer
> to a file.  Without ECC, the corruption happened in memory and went
> undetected.  Then the corruption was written to the file, with a correct
> checksum.  So in fact it's not filesystem corruption, and ZFS will
> correctly mark the filesystem as clean and free of checksum errors.
>
> In conclusion:
>
> Use ECC if you care about your data.
> Do backups if you care about your data.
>
> Don't be a cheapskate, or else, don't complain when you get bitten by
> lack of adequate data protection.
Miles Nordin
2010-Nov-17 19:53 UTC
[zfs-discuss] Ideas for ghetto file server data reliability?
>>>>> "sl" == Sigbjorn Lie <sigbjorn at nixtra.com> writes:sl> Do you need registered ECC, or will non-reg ECC do registered means the same thing as buffered. It has nothing to do with registering to some kind of authority---it''s a register like the accumulators inside CPU''s. The register allows more sticks per channel at the questionably-relevant cost of ``latency.'''' Lately, more than two sticks per channel seems to require registers. Your choice of motherboard (and the memory controller implied by that choice) decides whether the memory must be registered or must be unregistered, and I don''t know of any motherboards that will take both kinds (though I bet there are some out there, somewhere in history). There are other weird kinds of memory connection besides just registered and unregistered, but everything has higher latency than ``unregistered''''. None of this has anything to do with ECC, though it may sometimes seem to since both registers and ECC cost money so tightly cost-constrained systems might tend to have neither, and quantities go down and profit margins get immediately jacked up once you ask for either of the two. hth. :/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20101117/28f85789/attachment.bin>