thr3ads.net - freebsd stable - ZFS... [Apr 2019]

Karl Denninger

2019-Apr-30 13:11 UTC

ZFS...

On 4/30/2019 05:14, Michelle Sullivan wrote:
>> On 30 Apr 2019, at 19:50, Xin LI <delphij at gmail.com> wrote:
>>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle at sorbs.net> wrote:
>>> but in my recent experience 2 issues colliding at the same time
>>> results in disaster
>> Do we know exactly what kind of corruption happened to your pool?  If you
>> see it twice in a row, it might suggest a software bug that should be
>> investigated.
>>
>> All I know is it's a checksum error on a meta slab (122) and from what
>> I can gather it's the spacemap that is corrupt... but I am no expert.  I don't
>> believe it's a software fault as such, because this was caused by a hard outage
>> (damaged UPSes) whilst resilvering a single (but completely failed) drive.
>> ...and after the first outage a second occurred (same as the first but more
>> damaging to the power hardware)... the host itself was not damaged nor were the
>> drives or controller.
.....
>> Note that ZFS stores multiple copies of its essential metadata, and in
>> my experience with my old, consumer grade crappy hardware (non-ECC RAM, with
>> several faulty, single hard drive pool: bad enough to crash almost monthly and
>> damages my data from time to time),
> This was a top end consumer grade mb with non ecc ram that had been running
> for 8+ years without fault (except for hard drive platter failures.). Uptime
> would have been years if it wasn't for patching.
Yuck.

I'm sorry, but that may well be what nailed you.

ECC is not just about the random cosmic ray.  It also saves your bacon
when there are power glitches.

Unfortunately, there is also cache memory on most modern hard drives.
Most of the time (unless you explicitly shut it off) it's on for
write caching, and it'll nail you too.  Oh, and it's never, in my
experience, ECC.

In addition, and this is something I learned a LONG time ago
(think Z-80 processors!), as in so many very important things,
"two is one and one is none."

In other words without a backup you WILL lose data eventually, and it
WILL be important.

Raidz2 is very nice, but as the name implies you have two
redundancies.  If you take three errors, or if, God forbid, you *write*
a block that has a bad checksum in it because it got scrambled while in
RAM, you're dead if that happens in the wrong place.
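
[Editor's note] One reading of the point about a block scrambled in RAM is that it matters *when* the scrambling happens relative to the checksum. The following is a minimal Python sketch under that assumption, not ZFS code: `checksum`, `flip_bit`, `write_block` and `verify` are hypothetical helpers, and SHA-256 merely stands in for whatever block checksum the pool uses.

```python
import hashlib


def checksum(block):
    # Stand-in checksum; chosen only because it ships with Python.
    return hashlib.sha256(block).digest()


def flip_bit(block, bit_index):
    # Simulate a single-bit memory error in a copy of the block.
    buf = bytearray(block)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(block):
    # Toy "disk" record: the data plus the checksum computed at write time.
    return block, checksum(block)


def verify(stored_data, stored_checksum):
    return checksum(stored_data) == stored_checksum


intended = b"important payload" * 4

# Case 1: the bit flips in RAM *before* the checksum is computed.
# The checksum faithfully covers the wrong data, so verification passes.
scrambled = flip_bit(intended, 13)
data1, sum1 = write_block(scrambled)
passes_but_wrong = verify(data1, sum1) and data1 != intended

# Case 2: the bit flips *after* the checksum is computed.
# The mismatch is detectable on read, and redundancy can then help.
data2, sum2 = write_block(intended)
data2 = flip_bit(data2, 13)
detected = not verify(data2, sum2)

print("case 1 passes verification with wrong data:", passes_but_wrong)
print("case 2 detected on read:", detected)
```

In this toy model no amount of on-disk redundancy helps in case 1, because every copy is a faithful copy of the wrong bytes.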
> Yeah.. unlike UFS that has to get really really hosed to restore from
> backup with nothing recoverable it seems ZFS can get hosed where issues occur in
> just the wrong bit... but mostly it is recoverable (and my experience has been
> some nasty shit that always ended up being recoverable.)
>
> Michelle
Oh that is definitely NOT true.... again, from hard experience,
including (but not limited to) on FreeBSD.

My experience is that ZFS is materially more-resilient but there is no
such thing as "can never be corrupted by any set of events."  Backup
strategies for moderately large (e.g. many Terabytes) to very large
(e.g. Petabytes and beyond) get quite complex but they're also very
necessary.

-- 
Karl Denninger
karl at denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/

Michelle Sullivan

2019-Apr-30 13:30 UTC

ZFS...

Karl Denninger wrote:
> On 4/30/2019 05:14, Michelle Sullivan wrote:
>>> On 30 Apr 2019, at 19:50, Xin LI <delphij at gmail.com> wrote:
>>>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle at sorbs.net> wrote:
>>>> but in my recent experience 2 issues colliding at the same time
>>>> results in disaster
>>> Do we know exactly what kind of corruption happened to your pool?  If
>>> you see it twice in a row, it might suggest a software bug that should be
>>> investigated.
>>>
>>> All I know is it's a checksum error on a meta slab (122) and from
>>> what I can gather it's the spacemap that is corrupt... but I am no expert.  I
>>> don't believe it's a software fault as such, because this was caused by a hard
>>> outage (damaged UPSes) whilst resilvering a single (but completely failed)
>>> drive.  ...and after the first outage a second occurred (same as the first but
>>> more damaging to the power hardware)... the host itself was not damaged nor were
>>> the drives or controller.
> .....
>>> Note that ZFS stores multiple copies of its essential metadata, and
>>> in my experience with my old, consumer grade crappy hardware (non-ECC RAM, with
>>> several faulty, single hard drive pool: bad enough to crash almost monthly and
>>> damages my data from time to time),
>> This was a top end consumer grade mb with non ecc ram that had been
>> running for 8+ years without fault (except for hard drive platter failures.).
>> Uptime would have been years if it wasn't for patching.
> Yuck.
>
> I'm sorry, but that may well be what nailed you.
>
> ECC is not just about the random cosmic ray.  It also saves your bacon
> when there are power glitches.
No. Sorry, no.  If the data is only half written to disk, ECC isn't going
to save you at all... it's all about power on the drives to complete the
write.
>
> Unfortunately however there is also cache memory on most modern hard
> drives, most of the time (unless you explicitly shut it off) it's on for
> write caching, and it'll nail you too.  Oh, and it's never, in my
> experience, ECC.
No comment on that.  You're right on the first part; I can't comment on
whether there are drives with ECC.
>
> In addition, however, and this is something I learned a LONG time ago
> (think Z-80 processors!) is that as in so many very important things
> "two is one and one is none."
>
> In other words without a backup you WILL lose data eventually, and it
> WILL be important.
>
> Raidz2 is very nice, but as the name implies you have two
> redundancies.  If you take three errors, or if, God forbid, you *write*
> a block that has a bad checksum in it because it got scrambled while in
> RAM, you're dead if that happens in the wrong place.
Or in my case you write only part of the data, therefore invalidating the
checksum...
>
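
[Editor's note] The partial-write case can be pictured with a toy model. This is a minimal Python sketch, not a description of ZFS's actual on-disk behaviour (ZFS is copy-on-write, and the drive write cache discussed above complicates matters); `SECTOR`, `checksum` and `torn_write` are hypothetical names, and SHA-256 is only a stand-in checksum.

```python
import hashlib

SECTOR = 512  # assumed sector size for the toy model


def checksum(block):
    return hashlib.sha256(block).digest()


def torn_write(old, new, sectors_completed):
    # What the "disk" holds if power fails after only some sectors land:
    # the first sectors are new, the rest still hold the old contents.
    cut = sectors_completed * SECTOR
    return new[:cut] + old[cut:]


old_block = bytes([0xAA]) * (8 * SECTOR)
new_block = bytes([0x55]) * (8 * SECTOR)
expected = checksum(new_block)  # checksum recorded for the intended write

# Power is lost after 3 of the 8 sectors reach the platter.
on_disk = torn_write(old_block, new_block, sectors_completed=3)
matches_new = checksum(on_disk) == expected
matches_old = checksum(on_disk) == checksum(old_block)

# For comparison, a write that completes verifies cleanly.
complete = torn_write(old_block, new_block, sectors_completed=8)

print("torn block matches new checksum:", matches_new)
print("torn block matches old checksum:", matches_old)
print("complete write verifies:", checksum(complete) == expected)
```

The torn block matches neither the old nor the new checksum, so in this model the checksum flags it as bad but cannot restore it without another intact copy.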
>> Yeah.. unlike UFS that has to get really really hosed to restore from
>> backup with nothing recoverable it seems ZFS can get hosed where issues occur in
>> just the wrong bit... but mostly it is recoverable (and my experience has been
>> some nasty shit that always ended up being recoverable.)
>>
>> Michelle
> Oh that is definitely NOT true.... again, from hard experience,
> including (but not limited to) on FreeBSD.
>
> My experience is that ZFS is materially more-resilient but there is no
> such thing as "can never be corrupted by any set of events."
The latter part is true, and my blog and my current situation are not
limited to or aimed at FreeBSD specifically; FreeBSD is just where my
experience is.  As for the former part... it has been very resilient, but
I think (based on this particular set of events) it is easily corruptible
and I have just been lucky.  You just have to hit a certain write to
trigger the issue, and whilst that write might be very, very difficult
(read: hit and miss) to hit in normal everyday scenarios, it can and will
eventually happen.
> Backup
> strategies for moderately large (e.g. many Terabytes) to very large
> (e.g. Petabytes and beyond) get quite complex but they're also very
> necessary.
>
And therein lies the problem.  If you don't have a backup solution costing
many tens of thousands of dollars, you're either:

1/ down for a looooong time.
2/ losing all data and starting again...

...and that's the problem... with UFS you can recover most (in most
situations), and provided the *data* is there, uncorrupted by the fault,
you can get it all off with various tools even if it is a complete
mess....  Here I am with data that is apparently ok, but the
metadata is corrupt (and note: as I had stopped writing to the drive
when it started resilvering, the data - all of it - should be intact...
even if a mess.)

Michelle

-- 
Michelle Sullivan
http://www.mhix.org/

Daniel Kalchev

2019-Apr-30 13:40 UTC

ZFS...

> On 30 Apr 2019, at 16:11, Karl Denninger <karl at denninger.net> wrote:
>
> My experience is that ZFS is materially more-resilient but there is no
> such thing as "can never be corrupted by any set of events."  Backup
> strategies for moderately large (e.g. many Terabytes) to very large
> (e.g. Petabytes and beyond) get quite complex but they're also very
> necessary.
>
I can only second that statement. Being paranoid with your data (keep many
copies, have many backups) is never enough.

A colleague just complained the other day that they lost a zpool and that ZFS
didn't save their data... having not made a redundant pool, and with the hard
drive thrashing its heads. And no backups. The unreadable part of the drive
fell in metadata, and the pool cannot be imported.

I keep an HDD around that, since it was brand new, runs perfectly under any OS.
Rock solid, that is... and only ZFS complains that it reads things back it didn't
write. Before that, I would think UFS was ok... since then, I don't build a single
installation that does not have at least a mirrored ZFS pool. And "archive
servers" (stands for backup) have become the central focus of my work. These are
never enough.

Daniel
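
[Editor's note] The contrast between the colleague's single-disk pool and a mirrored pool can be sketched as follows. This is a minimal Python illustration of checksum-verified reads with a second copy to fall back on, not ZFS's implementation; `mirror_read` and the sample data are hypothetical, and SHA-256 is only a stand-in checksum.

```python
import hashlib


def checksum(block):
    return hashlib.sha256(block).digest()


def mirror_read(copies, expected_sum):
    # Return (data, repaired_count): the first copy whose checksum matches,
    # after overwriting any non-matching copies with it.
    # Raises IOError when no copy verifies.
    good = next((c for c in copies if checksum(c) == expected_sum), None)
    if good is None:
        raise IOError("no copy matches the recorded checksum")
    repaired = 0
    for i, c in enumerate(copies):
        if checksum(c) != expected_sum:
            copies[i] = good
            repaired += 1
    return good, repaired


original = b"archive data block"
recorded = checksum(original)
bad = b"archive dat\x00 block"  # a disk that reads back what was not written

# Mirrored pool: one side returns wrong bytes, the other side is intact.
copies = [bad, original]
data, repaired = mirror_read(copies, recorded)

# Single-disk pool: the only copy is bad, so the read cannot be satisfied.
try:
    mirror_read([bad], recorded)
    single_disk_ok = True
except IOError:
    single_disk_ok = False

print("mirror returned intended data:", data == original, "repaired:", repaired)
print("single disk readable:", single_disk_ok)
```

Neither case replaces a backup: if both sides of the mirror are damaged, the read fails just as it does on the single disk.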
