I know I'm not going to be popular for this, but I'll just drop it here anyhow.

http://www.michellesullivan.org/blog/1726

Perhaps one should reconsider either:

1. Looking at tools that may be able to recover corrupt ZFS metadata, or
2. Defaulting to non-ZFS filesystems on install.

--
Michelle Sullivan
http://www.mhix.org/
On Mon, Apr 29, 2019 at 10:23 AM Michelle Sullivan <michelle at sorbs.net> wrote:
>
> I know I'm not going to be popular for this, but I'll just drop it here
> anyhow.
>
> http://www.michellesullivan.org/blog/1726
>
> Perhaps one should reconsider either:
>
> 1. Looking at tools that may be able to recover corrupt ZFS metadata, or
> 2. Defaulting to non-ZFS filesystems on install.
>
> --
> Michelle Sullivan
> http://www.mhix.org/

Wow, losing multiple TB sucks for anybody. I'm sorry for your loss. But I want to respond to a few points from the blog post.

1) When ZFS says that "the data is always correct and there's no need for fsck", it means metadata as well as data. The spacemap is protected in exactly the same way as all other data and metadata (to be pedantically correct, the labels and uberblocks are protected in a different way, but still protected). The only way to get metadata corruption is a disk failure (a 3-disk failure when using RAIDZ2) or a software bug. Sadly, those do happen, and they're devilishly tricky to track down. The difference between ZFS and older filesystems is that older filesystems experience corruption during power loss _by_design_, not merely due to software bugs. A perfectly functioning UFS implementation will experience corruption during power loss, and that's why it needs to be fscked. It's not just theoretical, either. I use UFS on my development VMs, and they frequently experience corruption after a panic (which happens all the time because I'm working on kernel code).

2) Backups are essential with any filesystem, not just ZFS. After all, no amount of RAID will protect you from an accidental "rm -rf /".

3) ZFS hot spares can be swapped in automatically, though they aren't by default. It sounds like you already figured out how to assign a spare to the pool. To use it automatically, you must set the "autoreplace" pool property and enable zfsd. The latter can be done with 'sysrc zfsd_enable="YES"'. (A short sketch follows below.)

4) It sounds like you're having a lot of power trouble. Have you tried sysutils/apcupsd from ports? It's fairly handy. It can talk to a wide range of UPSes and can be configured to do things like send you an email on power loss and power down the server if the battery gets too low.

Better luck next time,
-Alan
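(A minimal sketch of the spare/zfsd setup from point 3 and the apcupsd suggestion from point 4; the pool name "tank" and the disk name "da4" are hypothetical examples, not details from the thread:)

  # Attach a hot spare to the pool.
  zpool add tank spare da4

  # Let ZFS activate the spare automatically when a member disk fails.
  zpool set autoreplace=on tank

  # Enable and start zfsd, which performs the automatic replacement.
  sysrc zfsd_enable="YES"
  service zfsd start

  # Optional: UPS monitoring with graceful shutdown on low battery.
  pkg install apcupsd
  sysrc apcupsd_enable="YES"
  service apcupsd start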
Hi!

> I know I'm not going to be popular for this, but I'll just drop it here
> anyhow.
>
> http://www.michellesullivan.org/blog/1726

With all due respect, if that filesystem/server you describe has not kept up with all those mishaps, then it's not perfect; but nothing is.

> Perhaps one should reconsider either:
>
> 2. Defaulting to non ZFS filesystems on install.

I have had more cases of UFS being toast than ZFS so far.

> 1. Looking at tools that may be able to recover corrupt ZFS metadata, or

Here I agree! Making tools available to dig around zombie zpools, icky as that is in itself, would be helpful!

--
pi at opsec.eu            +49 171 3101372              One year to go !
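(For read-only digging, zdb(8) can already inspect a pool that won't import; a rough sketch, with the pool name "tank" and the device path as hypothetical examples, and none of it repairs anything:)

  # Dump the vdev labels and uberblocks from a disk that belonged to the pool.
  zdb -l /dev/da0p3

  # Walk an exported or un-importable pool's datasets without importing it.
  zdb -e -d tank

  # Last resort: ask zpool to roll back to an older transaction group on import.
  zpool import -F tank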
Your story is so unusual that I am wondering if it's not fiction: all sorts of power cuts where it just so happens the UPS fails every time, then you decide to ship a server halfway round the world, and on top of that you get a way above average rate of hard drive failures. But aside from all this, you managed to recover multiple times.

ZFS has never claimed to be a get-out-of-jail-free card, but it did survive multiple times in your case. I suggest, though, that if you value redundancy you do not use RAIDZ but use mirrors instead (see the sketch after the quoted message below). I don't know why people keep persisting with RAID 5/6 nowadays, with drives as large as they are.

I have used ZFS since the days of FreeBSD 8.x, and its resilience compared to the likes of ext, and especially compared to UFS, is astounding. Before marking it down, think about how UFS or ext would have managed the scenarios you presented in your blog. Also think about where you are hosting your data, given all your power failures, and about the UPS equipment you utilise as well.

On Mon, 29 Apr 2019 at 16:26, Michelle Sullivan <michelle at sorbs.net> wrote:
>
> I know I'm not going to be popular for this, but I'll just drop it here
> anyhow.
>
> http://www.michellesullivan.org/blog/1726
>
> Perhaps one should reconsider either:
>
> 1. Looking at tools that may be able to recover corrupt ZFS metadata, or
> 2. Defaulting to non ZFS filesystems on install.
>
> --
> Michelle Sullivan
> http://www.mhix.org/
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
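(To make the mirror suggestion concrete, a pool of striped mirror pairs is created roughly like this; the pool and disk names are hypothetical:)

  # Two mirrored pairs striped together: any single disk in a pair can fail,
  # and a resilver only has to read from the surviving half of that pair.
  zpool create tank mirror da0 da1 mirror da2 da3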