Hi,

I have a zpool with 3 mirrors, each containing 2 drives. I replaced one of the drives by doing:

    zpool offline tank-nfs c9t7d0

Then I swapped the drive and started a rebuild using:

    zpool replace tank-nfs c9t7d0

All drives in the pool are Seagate ES.2 1TB SAS drives.

After 15 hours it showed 100% done, 0h0m to go (and ca. 150G resilvered), and now, approx. 40 hours later, it still shows:

    Last login: Sat Oct 17 14:48:14 2009 from 192.168.251.49
    Sun Microsystems Inc.   SunOS 5.11      snv_118 November 2008
    admin at nas01.securedomainservice.net:~$ zpool status -xv
      pool: tank-nfs
     state: DEGRADED
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
     scrub: resilver in progress for 65h24m, 100,00% done, 0h0m to go
    config:

            NAME              STATE     READ WRITE CKSUM
            tank-nfs          DEGRADED     0     0     0
              mirror          DEGRADED     0     0     0
                c9t6d0        ONLINE       0     0     0
                replacing     DEGRADED 2,50M     0     0
                  c9t7d0s0/o  FAULTED      0     0     0  corrupted data
                  c9t7d0      ONLINE       0     0 2,50M  1,59T resilvered
              mirror          ONLINE       0     0     0
                c9t8d0        ONLINE       0     0     0
                c9t9d0        ONLINE       0     0     0
              mirror          ONLINE       0     0     0
                c9t10d0       ONLINE       0     0     0
                c9t11d0       ONLINE       0     0     0
            cache
              c8d1            ONLINE       0     0     0

    errors: No known data errors

Is this normal, and will it complete?

- Rasmus Fauske
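For watching a long resilver from a script rather than re-running zpool status by hand, the "scrub:" line can be parsed. A minimal sketch, assuming the snv_118 output format shown above; `check_resilver` is a hypothetical helper, and on a live system you would pipe real output into it (`zpool status tank-nfs | check_resilver`):

```shell
#!/bin/sh
# check_resilver: extract the reported percentage from a "scrub:" line.
# The field position is an assumption based on the status output above.
check_resilver() {
    awk '/resilver in progress/ { print $7 }'   # 7th field is the percentage
}

# Fed here with the line captured in the status output above:
pct=$(echo ' scrub: resilver in progress for 65h24m, 100,00% done, 0h0m to go' \
  | check_resilver)
echo "resilver reports $pct done"
```

Looping on this (e.g. under `while sleep 300`) makes it easy to log whether the reported percentage ever moves past the stuck value.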
I too have seen this problem.

I had done a zfs send from my main pool "terra" (6-disk raidz on Seagate 1TB drives) to a mirror pair of WD Green 1TB drives. The zfs send was successful, however I noticed after a while (~1 week) that the pool was degraded, with one of the mirror disks constantly resilvering (40 TB resilvered on a 1TB disk); something was fishy.

I removed the disk that was getting the resilver, replaced it with another WD Green 1TB (factory new) and added it as a mirror to the pool again; it resilvered successfully. I performed a scrub the next day (couple of reboots etc.) and it started resilvering the replaced drive again.

I still had most of the data in the original pool, so I ran md5sum against some of the original files (~20GB files) and the ex-mirror copies, and the md5 sums came back the same.

I have since blown away the ex-mirror, re-created the zpool mirror and copied the data back. I have not seen this occur since the new zpool was created.

I have been running the dev build of 2010.1.
-- 
This message posted from opensolaris.org
On 18-Oct-09, at 6:41 AM, Adam Mellor wrote:

> I Too have seen this problem.
> [...]
> I still had most of the data in the original pool, i performed a
> md5sum against some of the original files (~20GB files) and the
> ex-mirror copy and the md5 sums came back the same.

This doesn't test much; ZFS will use whichever side of the mirror is good.

--Toby

> I have since blown away the ex-mirror and re-created the zpool
> mirror and copied the data back.
...
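Following Toby's point, a stronger check than md5sum is a scrub: it forces ZFS to read and verify every allocated block on both sides of the mirror, and any mismatch shows up in the CKSUM column of the status output. A small sketch of checking that column after `zpool scrub <pool>` finishes; the config table is embedded here as sample data (taken from the first post in this thread), and the parsing of column 5 is an assumption about that layout. On a live system you would pipe `zpool status <pool>` in instead:

```shell
#!/bin/sh
# Flag devices whose CKSUM column is nonzero after a scrub.
# Sample config table embedded for illustration; column positions
# are an assumption based on the status output earlier in the thread.
bad=$(printf '%s\n' \
  '        NAME        STATE     READ WRITE CKSUM' \
  '        tank-nfs    ONLINE       0     0     0' \
  '          mirror    ONLINE       0     0     0' \
  '            c9t6d0  ONLINE       0     0     0' \
  '            c9t7d0  ONLINE       0     0 2,50M' \
  | awk 'NR > 1 && $5 != "0" { print $1 }')
echo "devices with checksum errors: ${bad:-none}"
```

A clean scrub (all CKSUM counts zero, "No known data errors") exercises the resilvered disk in a way that reading through the filesystem does not.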
Hi,

Now I have tried to restart the resilvering by detaching c9t7d0 and then attaching it again to the mirror. The resilvering starts, but now, after almost 24 hours, it is still going. iostat still shows data flowing:

    tank-nfs      446G  2,28T    112      8  13,5M  35,9K
      mirror      145G   783G    107      2  13,4M  12,0K
        c9t6d0       -      -    106      2  13,3M  12,0K
        c9t7d0       -      -      0    110      0  13,4M

    $ zpool status -xv
      pool: tank-nfs
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
     scrub: resilver in progress for 23h23m, 100,00% done, 0h0m to go
    config:

            NAME        STATE     READ WRITE CKSUM
            tank-nfs    ONLINE       0     0     0
              mirror    ONLINE       0     0     0
                c9t6d0  ONLINE       0     0     0
                c9t7d0  ONLINE       0     0     0  1,02T resilvered

This time, as you can see, there are 0 checksum errors during the resilvering. Is it something with the build I am using? (118)

-- 
Rasmus Fauske

Markus Kovero skrev:

> We've noticed this behaviour when there's a problem with ram (plenty of
> checksum errors), and in these cases I doubt the resilver will ever
> finish. You can use iostat to monitor whether there's anything happening
> on the disk that should be resilvered; if not, I'd say the data wanted
> for the resilver has somehow gone bad due to broken ram, or who knows.
> (this is actually resilver and checksumming working as it should; no
> data that is not valid should be written)
>
> Yours
> Markus Kovero
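The detach/attach restart described above can be scripted, but since detaching the wrong device drops redundancy, a dry run that only builds and prints the commands is safer to share. A sketch using the pool and device names from this thread; set `run=` (empty) to actually execute on a real system:

```shell
#!/bin/sh
# Dry-run sketch of restarting a stuck resilver by detach + attach.
# Names are the ones from this thread; adjust for your pool.
pool=tank-nfs
healthy=c9t6d0      # surviving side of the mirror
replaced=c9t7d0     # device whose resilver is stuck

# Build the two commands; attaching to the healthy device re-creates
# the two-way mirror and kicks off a fresh resilver.
cmd_detach="zpool detach $pool $replaced"
cmd_attach="zpool attach $pool $healthy $replaced"

echo "$cmd_detach"
echo "$cmd_attach"
```

Note that between the detach and the attach the data on that mirror has only one copy, so this is worth doing only when the stuck resilver itself is the problem.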