So... issues with resilvering yet again. This is a ~3TB pool: one raidz1 of 5 500GB disks and a second raidz1 of 3 300GB disks. One of the 300GB disks failed, so I have replaced the drive.
After the resilver starts, it takes approximately 5 minutes to reach 68.05% done... then it appears to just hang. It's been this way for 30 hours now. If I do a zpool status, the command never finishes; it just hangs after printing:
scrub: resilver in progress, 68.05% done
If I do a zpool iostat, it shows zero disk activity. If I do a zpool iostat -v, that command hangs as well. There's zero other activity on this pool, since I suspended all shares while resilvering in hopes it would speed things up.
I've seen previous threads about how slow this can be, but this is a bit ridiculous. At this point I'm afraid to put any more data into a zpool if one failed disk takes weeks to rebuild.
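In theory iostat and fmdump don't go through ZFS at all, so they should still answer even while the zpool commands are wedged; something like the below (device names are obviously from my config, adjust for yours) ought to show whether the disks are genuinely idle or just not reporting:
root:=> iostat -xn 5    # per-device I/O stats every 5 seconds; ~0 r/s and w/s on c4t5d0 would mean the resilver really is stalled
root:=> fmdump -e       # FMA error telemetry; transport or media errors here would point at the new drive or its cabling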
After messing around... who knows what's going on with it now. I finally rebooted because I was sick of it hanging. After that, this is what it came back with:
root:=> zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.00% done, 87h56m to go
config:

        NAME                      STATE     READ WRITE CKSUM
        fserv                     DEGRADED     0     0     0
          raidz1                  ONLINE       0     0     0
            c4t0d0                ONLINE       0     0     0
            c4t1d0                ONLINE       0     0     0
            c4t2d0                ONLINE       0     0     0
            c4t3d0                ONLINE       0     0     0
            c4t4d0                ONLINE       0     0     0
          raidz1                  DEGRADED     0     0     0
            c4t6d0                ONLINE       0     0     0
            c4t7d0                ONLINE       0     0     0
            replacing             DEGRADED     0     0     0
              12544952246745011915  FAULTED    0     0     0  was /dev/dsk/c4t5d0s0/old
              c4t5d0              ONLINE       0     0     0
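For what it's worth, that 20-digit name is the GUID ZFS kept for the old disk; the c4t5d0 under it is the new one being resilvered onto. If you want to double-check what label the new disk actually carries, zdb can dump it directly (assuming the pool sits on the s0 slice, as the "was /dev/dsk/c4t5d0s0/old" line suggests):
root:=> zdb -l /dev/dsk/c4t5d0s0    # prints the four vdev labels, with pool name and vdev GUIDs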
root:=> zpool iostat -v
                               capacity     operations    bandwidth
pool                         used  avail   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
fserv                        990G  2.11T    397     25  25.3M   101K
  raidz1                     866G  1.42T    201      1   804K  5.29K
    c4t0d0                      -      -    133      1   533K  9.86K
    c4t1d0                      -      -    133      1   544K  9.89K
    c4t2d0                      -      -    133      1   541K  9.88K
    c4t3d0                      -      -    133      1   535K  9.82K
    c4t4d0                      -      -    132      1   525K  9.86K
  raidz1                      124G   708G    196     23  24.5M  95.6K
    c4t6d0                      -      -    102     31  12.3M  84.1K
    c4t7d0                      -      -    102     31  12.3M  83.8K
    replacing                    -      -      0     42      0  1.48M
      12544952246745011915       -      -      0      0  1.67K      0
      c4t5d0                     -      -      0     25  2.04K  1.50M
--------------------------  -----  -----  -----  -----  -----  -----
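Back-of-the-envelope, assuming the 124G on that raidz1 is spread evenly over its 3 disks: c4t5d0 needs roughly 124G / 3 ≈ 41G rewritten, and at the ~1.50M/s it's actually getting, that works out to about 41 * 1024 / 1.5 ≈ 28,000 seconds, call it 8 hours. Nowhere near the 87h56m estimate, but also nothing like the 5 minutes it originally took to claim 68%.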
That locked up pretty quickly as well; one more reboot and this is what I'm seeing now:
root:=> zpool status
  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 1.81% done, 0h19m to go
config:

        NAME            STATE     READ WRITE CKSUM
        fserv           DEGRADED     0     0     0
          raidz1        ONLINE       0     0     0
            c4t0d0      ONLINE       0     0     0
            c4t1d0      ONLINE       0     0     0
            c4t2d0      ONLINE       0     0     0
            c4t3d0      ONLINE       0     0     0
            c4t4d0      ONLINE       0     0     0
          raidz1        DEGRADED     0     0     0
            c4t6d0      ONLINE       0     0     0
            c4t7d0      ONLINE       0     0     0
            replacing   DEGRADED     0     0     0
              c4t5d0s0/o  FAULTED    0     0     0  corrupted data
              c4t5d0    ONLINE       0     0     0

errors: No known data errors
The "corrupted data" seems a bit scary to me...