Hi,
I have a zpool with 3 mirrors, each containing 2 drives. I replaced one
of the drives by doing:
zpool offline tank-nfs c9t7d0
Then I replaced that drive and started a rebuild using:
zpool replace tank-nfs c9t7d0
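For reference, the offline/replace sequence above can be sketched as the
following dry run. It only echoes the commands (pool and device names are
the ones from this thread); swap the `run` definition to actually execute
it, as root, on the affected system:

```shell
#!/bin/sh
# Dry-run sketch of the drive-replacement flow described above.
# Pool/device names come from this thread; 'run' only echoes the commands.
POOL=tank-nfs
DISK=c9t7d0

run() { echo "+ $*"; }   # to execute for real, use: run() { "$@"; }

run zpool offline "$POOL" "$DISK"
# ...physically swap the drive here, then start the rebuild:
run zpool replace "$POOL" "$DISK"
# Watch progress with:
run zpool status -x "$POOL"
```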
All drives in the pool are Seagate ES.2 1TB SAS drives
After 15 hours it showed 100% done, 0h0m to go (and ca. 150G
resilvered), but now, approx. 40 hours later, it still shows:
Sun Microsystems Inc. SunOS 5.11 snv_118 November 2008
admin@nas01.securedomainservice.net:~$ zpool status -xv
  pool: tank-nfs
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 65h24m, 100,00% done, 0h0m to go
config:
        NAME              STATE     READ WRITE CKSUM
        tank-nfs          DEGRADED     0     0     0
          mirror          DEGRADED     0     0     0
            c9t6d0        ONLINE       0     0     0
            replacing     DEGRADED 2,50M     0     0
              c9t7d0s0/o  FAULTED      0     0     0  corrupted data
              c9t7d0      ONLINE       0     0 2,50M  1,59T resilvered
          mirror          ONLINE       0     0     0
            c9t8d0        ONLINE       0     0     0
            c9t9d0        ONLINE       0     0     0
          mirror          ONLINE       0     0     0
            c9t10d0       ONLINE       0     0     0
            c9t11d0       ONLINE       0     0     0
        cache
          c8d1            ONLINE       0     0     0

errors: No known data errors
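A quick way to watch whether a resilver is actually moving is to pull just
the "scrub:" line out of zpool status. A minimal sketch; the sample line
below is pasted from the status output above, and on a live system you
would pipe `zpool status tank-nfs` through the same grep/sed instead:

```shell
#!/bin/sh
# Extract the reported completion percentage from the 'scrub:' line of
# zpool status. Sample line taken from the output above; live usage:
#   zpool status tank-nfs | grep 'scrub:' | sed -n '...'
status_sample='scrub: resilver in progress for 65h24m, 100,00% done, 0h0m to go'
pct=$(printf '%s\n' "$status_sample" | sed -n 's/.*for [^,]*, \([0-9,.]*%\) done.*/\1/p')
echo "resilver progress: $pct"
```

Comparing this value over time (together with iostat, as suggested later in
the thread) shows whether the resilver is stuck or still working.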
Is this normal, and will it complete?
-
Rasmus Fauske
I too have seen this problem.

I had done a zfs send from my main pool "terra" (6-disk raidz on Seagate
1TB drives) to a mirror pair of WD Green 1TB drives. The zfs send was
successful; however, I noticed after a while (~1 week) that the pool was
degraded, with one of the mirror disks constantly resilvering (40 TB
resilvered on a 1TB disk). Something was fishy.

I removed the disk that was being resilvered, replaced it with another
(factory-new) WD Green 1TB, and added it to the pool as a mirror again; it
resilvered successfully. I performed a scrub the next day (after a couple
of reboots etc.) and it started resilvering the replaced drive again.

I still had most of the data in the original pool, so I ran md5sum against
some of the original files (~20GB files) and the ex-mirror copies, and the
md5 sums came back the same.

I have since blown away the ex-mirror, re-created the zpool mirror, and
copied the data back. I have not seen this occur since creating the new
zpool. I have been running the dev build on 2010.1.
--
This message posted from opensolaris.org
On 18-Oct-09, at 6:41 AM, Adam Mellor wrote:

> I still had most of the data in the original pool, i performed a md5sum
> against some of the original files (~20GB files) and the ex-mirror copy
> and the md5 sums came back the same.

This doesn't test much; ZFS will use whichever side of the mirror is good.

--Toby

> I have since blown away the ex-mirror and re-created the zpool mirror
> and copied the data back....
Hi,
Now I have tried to restart the resilvering by detaching c9t7d0 and then
attaching it again to the mirror. The resilver starts, but now, after
almost 24 hours, it is still going.
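The detach/re-attach restart described above can be sketched as another dry
run (it only echoes the commands; pool and device names are the ones from
this thread, and the attach target is the surviving mirror side, c9t6d0):

```shell
#!/bin/sh
# Dry-run sketch of restarting the resilver by detaching the disk and
# re-attaching it to the surviving side of the mirror. 'run' only echoes.
run() { echo "+ $*"; }

run zpool detach tank-nfs c9t7d0
# Re-attach c9t7d0 as a new mirror side of c9t6d0; this triggers a full
# resilver of c9t7d0 from c9t6d0.
run zpool attach tank-nfs c9t6d0 c9t7d0
```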
From iostat it still shows data flowing:

tank-nfs      446G  2,28T    112      8  13,5M  35,9K
  mirror      145G   783G    107      2  13,4M  12,0K
    c9t6d0       -      -    106      2  13,3M  12,0K
    c9t7d0       -      -      0    110      0  13,4M
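One way to confirm that writes are still reaching the new disk is to pull
the write-bandwidth column (the last field of the device's line) out of
`zpool iostat -v` with awk. A small sketch using the c9t7d0 sample line
from the output above:

```shell
#!/bin/sh
# Extract the write bandwidth (last column) for the resilvering disk from
# 'zpool iostat -v tank-nfs' output. Sample line pasted from above; live:
#   zpool iostat -v tank-nfs | awk '$1 == "c9t7d0" {print $NF}'
line='      c9t7d0      -      -      0    110      0  13,4M'
wbw=$(printf '%s\n' "$line" | awk '{print $NF}')
echo "c9t7d0 write bandwidth: $wbw"
```

A non-zero value here means the resilver is still writing to the disk even
though the status line claims 100,00% done.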
$ zpool status -xv
  pool: tank-nfs
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 23h23m, 100,00% done, 0h0m to go
config:
        NAME        STATE     READ WRITE CKSUM
        tank-nfs    ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0  1,02T resilvered
This time, as you can see, there are 0 checksum errors during the resilvering.
Is it something with the build I am using (snv_118)?
--
Rasmus Fauske
Markus Kovero wrote:
> We've noticed this behaviour when there's a problem with RAM (plenty
> of checksum errors), and in those cases I doubt the resilver will ever
> finish. You can use iostat to monitor whether anything is happening on
> the disk that should be resilvered; if not, I'd say the data wanted for
> the resilver has somehow gone bad due to broken RAM, or who knows.
>
> (This is actually the resilver and checksumming working as they should:
> no data that is not valid should be written.)
>
> Yours,
> Markus Kovero