Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 30 Apr 2019, at 18:44, rainer@ultra-secure.de wrote:
>
> On 2019-04-30 at 10:09, Michelle Sullivan wrote:
>
>> Now, yes most production environments have multiple backing stores so
>> will have a server or ten to switch to whilst the store is being
>> recovered, but it still wouldn't be a pleasant experience... not to
>> mention the possibility that if one store is corrupted there is a
>> chance that the other store(s) would also be affected in the same way
>> if in the same DC... (e.g. a DC fire - which I have seen) .. and if you
>> have multi-DC stores to protect from that, the size of the pipes
>> between DCs clearly comes into play.
>
> I have one customer with about 13T of ZFS - and because it would take a
> while to restore (actual backups), it zfs-sends delta-snapshots every
> hour to a standby system.
>
> It was handy when we had to rebuild the system with different HBAs.

I wonder what would happen if you scaled that up by just 10x (storage) and
had the master blow up where it needs to be restored from backup.. how
long would one be praying to higher powers that there is no problem with
the backup...? (As in no outage or error causing a complete outage.)...
Don't get me wrong.. we all get to that position at some time, but in my
recent experience 2 issues colliding at the same time results in disaster.
13T is really not something I have issues with, as I can usually cobble
something together with 16T.. (at least since 6T drives became a viable
(cost and availability at short notice) option... even 10T is becoming
easier to get hold of now..) but I have a measly 96T here and it takes
weeks even with gigabit bonded interfaces when I need to restore.
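
(A rough back-of-envelope to put numbers on that "weeks" figure -- this is
only a sketch, and the 200 MB/s assumes two bonded gigabit links running
flat out, which a real restore rarely sustains:

    # ~96 TB over 2 x 1 Gbit/s, i.e. roughly 200 MB/s of usable throughput
    echo "scale=1; 96 * 10^12 / (200 * 10^6) / 86400" | bc
    # => ~5.5 days of pure wire time; add protocol overhead, verification,
    #    and the write speed of the pool being rebuilt, and "weeks" is
    #    entirely plausible.)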
On 2019-04-30 at 11:05, Michelle Sullivan wrote:

> I wonder what would happen if you scaled that up by just 10x (storage)
> and had the master blow up where it needs to be restored from backup..
> how long would one be praying to higher powers that there is no
> problem with the backup...?

Well, the backup itself takes over a day, AFAIK. I'm not sure why it is
that slow. Maybe the old SAS6 HBAs. It's also lots of files.

> (As in no outage or error causing a complete outage.)... don't get me
> wrong.. we all get to that position at some time, but in my recent
> experience 2 issues colliding at the same time results in disaster.
> 13T is really not something I have issues with, as I can usually cobble
> something together with 16T.. (at least since 6T drives became a viable
> (cost and availability at short notice) option... even 10T is becoming
> easier to get hold of now..) but I have a measly 96T here and it takes
> weeks even with gigabit bonded interfaces when I need to restore.

Those are all SAS drives, actually. 600 and 1200 GB.
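
(For anyone wanting to set up that kind of hourly delta-send themselves,
it boils down to something like the sketch below. The dataset name, the
"standby" host and the snapshot naming scheme are illustrative only, not a
description of rainer's actual configuration:

    #!/bin/sh
    # Hourly incremental replication sketch, run from cron on the master.
    # "tank/data" and "standby" are placeholders.
    prev=$(zfs list -H -t snapshot -o name -s creation -d 1 tank/data | tail -1)
    now="tank/data@$(date +%Y%m%d%H)"
    zfs snapshot "$now"
    # Send only the blocks changed since the previous snapshot; -F on the
    # receive side rolls the standby back to the last common snapshot if
    # anything touched it in the meantime.
    zfs send -i "$prev" "$now" | ssh standby zfs receive -F tank/data

The first run needs a full send instead of an incremental one.)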
On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle@sorbs.net> wrote:

> but in my recent experience 2 issues colliding at the same time results
> in disaster

Do we know exactly what kind of corruption happened to your pool?  If you
see it twice in a row, it might suggest a software bug that should be
investigated.

Note that ZFS stores multiple copies of its essential metadata, and in my
experience with my old, consumer-grade, crappy hardware (non-ECC RAM, and
several single-drive pools on faulty disks: bad enough to crash almost
monthly and damage my data from time to time), I've never seen corruption
this bad and I was always able to recover the pool.  At a previous
employer, the only case where a pool was corrupted badly enough that
mounting was not allowed was when two host nodes happened to import the
pool at the same time, which is a situation that can be avoided with SCSI
reservations; their hardware was of much better quality, though.

Speaking of a tool like 'fsck': I think I'm mostly convinced that it's
*not* necessary, because by the time ZFS says the metadata is corrupted,
that metadata really is corrupted beyond repair (all replicas were
corrupted; otherwise it would recover by finding the good copy and
rewriting the bad ones).  An interactive tool may be useful (e.g. "I see
data structure versions 1, 2 and 3, all with bad checksums; choose which
one you want to try"), but I don't think it would be very practical for
large pools -- unlike traditional filesystems, ZFS uses copy-on-write and
depends heavily on its metadata to find where the data is, so a regular
"scan" is not really useful.

I'd agree that you need a full backup anyway, regardless of what storage
system is used, though.
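
(Worth noting that a limited form of that "pick an older copy of the data
structures" idea already exists as the rewind options of zpool import; how
far back it can rewind is limited, so it is no substitute for a backup. A
sketch, with "storage" standing in for the pool name:

    # Option 1: read-only import, to see what is visible without writing
    # anything to the pool:
    zpool import -o readonly=on -f storage
    # Option 2: recovery-mode import, which discards the last few
    # transactions and rolls back to an earlier, hopefully consistent,
    # txg.  With -n it only reports whether the rewind would succeed:
    zpool import -F -n storage
    zpool import -F storage

Whether either helps depends entirely on how far back the damage goes.)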
On Apr 30, 2019, at 5:05 AM, Michelle Sullivan <michelle@sorbs.net> wrote:

> Michelle Sullivan
> http://www.mhix.org/
> Sent from my iPad
>
>> On 30 Apr 2019, at 18:44, rainer@ultra-secure.de wrote:
>>
>> On 2019-04-30 at 10:09, Michelle Sullivan wrote:
>>
>>> Now, yes most production environments have multiple backing stores so
>>> will have a server or ten to switch to whilst the store is being
>>> recovered, but it still wouldn't be a pleasant experience... not to
>>> mention the possibility that if one store is corrupted there is a
>>> chance that the other store(s) would also be affected in the same way
>>> if in the same DC... (e.g. a DC fire - which I have seen) .. and if you
>>> have multi-DC stores to protect from that, the size of the pipes
>>> between DCs clearly comes into play.
>>
>> I have one customer with about 13T of ZFS - and because it would take a
>> while to restore (actual backups), it zfs-sends delta-snapshots every
>> hour to a standby system.
>>
>> It was handy when we had to rebuild the system with different HBAs.
>
> I wonder what would happen if you scaled that up by just 10x (storage)
> and had the master blow up where it needs to be restored from backup..
> how long would one be praying to higher powers that there is no problem
> with the backup...? (As in no outage or error causing a complete
> outage.)... Don't get me wrong.. we all get to that position at some
> time, but in my recent experience 2 issues colliding at the same time
> results in disaster. 13T is really not something I have issues with, as
> I can usually cobble something together with 16T.. (at least since 6T
> drives became a viable (cost and availability at short notice) option...
> even 10T is becoming easier to get hold of now..) but I have a measly
> 96T here and it takes weeks even with gigabit bonded interfaces when I
> need to restore.

Such is the curse of large-scale storage when disaster befalls it. I guess
you need to invent a home-brew version of Amazon Snowball or Amazon
Snowmobile. ;-)

Cheers,

Paul.
Xin LI wrote:
> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <michelle@sorbs.net
> <mailto:michelle@sorbs.net>> wrote:
>
>     but in my recent experience 2 issues colliding at the same time
>     results in disaster
>
> Do we know exactly what kind of corruption happened to your pool?  If
> you see it twice in a row, it might suggest a software bug that should
> be investigated.

Oh, I did spot one interesting bug... though it is benign... Check out the
following (note the difference between 'zpool status' and 'zpool status -v'):

root@colossus:/mnt # zpool status
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     5
          raidz2-0  ONLINE       0     0    20
            mfid11  ONLINE       0     0     0
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: 4 data errors, use '-v' for a list

root@colossus:/mnt # zpool status
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     5
          raidz2-0  ONLINE       0     0    20
            mfid11  ONLINE       0     0     0
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: 4 data errors, use '-v' for a list

root@colossus:/mnt # zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     5
          raidz2-0  ONLINE       0     0    20
            mfid11  ONLINE       0     0     0
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x3e>
        <metadata>:<0x5d>
        storage:<0x0>
        storage@now:<0x0>

root@colossus:/mnt # zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     7
          raidz2-0  ONLINE       0     0    28
            mfid11  ONLINE       0     0     0
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x3e>
        <metadata>:<0x5d>
        storage:<0x0>
        storage@now:<0x0>

root@colossus:/mnt # zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x3e>
        <metadata>:<0x5d>
        storage:<0x0>
        storage@now:<0x0>

root@colossus:/mnt # zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Apr 29 20:22:03 2019
        6.54T scanned at 0/s, 6.54T issued at 0/s, 28.8T total
        445G resilvered, 22.66% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0    11
          raidz2-0  ONLINE       0     0    44
            mfid11  ONLINE       0     0     0
            mfid10  ONLINE       0     0     0
            mfid8   ONLINE       0     0     0
            mfid7   ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid5   ONLINE       0     0     0
            mfid4   ONLINE       0     0     0
            mfid3   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0
            mfid14  ONLINE       0     0     0
            mfid15  ONLINE       0     0     0
            mfid6   ONLINE       0     0     0
            mfid9   ONLINE       0     0     0
            mfid13  ONLINE       0     0     0
            mfid1   ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x3e>
        <metadata>:<0x5d>
        storage:<0x0>
        storage@now:<0x0>

root@colossus:/mnt #

--
Michelle Sullivan
http://www.mhix.org/
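
(For anyone who wants to poke at those <metadata> entries: they are MOS
object numbers, 0x3e = 62 and 0x5d = 93 in decimal, and zdb can dump
individual objects by number. This is only a sketch, and zdb run against a
pool that is mid-resilver and accumulating checksum errors may not get
very far:

    # Dump MOS objects 62 (0x3e) and 93 (0x5d) from the pool "storage".
    zdb -dddd storage 62 93
)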