thr3ads.net - similar to: "possible resilver bugs"

Displaying 20 results from an estimated 4000 matches similar to: "possible resilver bugs"

Degraded zpool won't online disk device, instead resilvers spare

2007 Dec 12

I've got a zpool that has 4 raidz2 vdevs each with 4 disks (750GB), plus 4 spares. At one point 2 disks failed (in different vdevs). The message in /var/adm/messages for the disks was 'device busy too long'. Then SMF printed this message: Nov 23 04:23:51 x.x.com EVENT-TIME: Fri Nov 23 04:23:51 EST 2007 Nov 23 04:23:51 x.x.com PLATFORM: Sun Fire X4200 M2, CSN:

never ending resilver

2010 Jul 05

Hi list, Here's my case: pool: mypool state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub: resilver in progress for 147h19m, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM filerbackup13

zpool degraded status after resilver completed

2007 Sep 08

I am curious why zpool status reports a pool to be in the DEGRADED state after a drive in a raidz2 vdev has been successfully replaced. In this particular case drive c0t6d0 was failing, so I ran zpool offline home/c0t6d0 followed by zpool replace home c0t6d0 c8t1d0, and after the resilvering finished the pool reports a degraded state. Hopefully this is incorrect. At this point is the vdev in question now has

Checksum errors on and after resilver

2010 Apr 14

Hi all, I recently experienced a disk failure on my home server and observed checksum errors while resilvering the pool and on the first scrub after the resilver had completed. Now everything seems fine but I'm posting this to get help with calming my nerves and detecting any possible future faults. Let's start with some specs. OSOL 2009.06 Intel SASUC8i (w LSI 1.30IT FW) Gigabyte

raidz2 another resilver problem

2007 Apr 11

Hello zfs-discuss, One of the disks started to behave strangely. Apr 11 16:07:42 thumper-9.srv sata: [ID 801593 kern.notice] NOTICE: /pci@1,0/pci1022,7458@3/pci11ab,11ab@1: Apr 11 16:07:42 thumper-9.srv port 6: device reset Apr 11 16:07:42 thumper-9.srv scsi: [ID 107833 kern.warning] WARNING: /pci@1,0/pci1022,7458@3/pci11ab,11ab@1/disk@6,0 (sd27): Apr 11 16:07:42 thumper-9.srv

internal scrub keeps restarting resilvering?

2009 Oct 30

After several days of trying to get a 1.5TB drive to resilver and it continually restarting, I eliminated all of the snapshot-taking facilities which were enabled and 2009-10-29.14:58:41 [internal pool scrub done txg:567780] complete=0 2009-10-29.14:58:41 [internal pool scrub txg:567780] func=1 mintxg=3 maxtxg=567354 2009-10-29.16:52:53 [internal pool scrub done txg:567999] complete=0

Slow Resilvering Performance

2009 Jul 10

I know this topic has been discussed many times... but what the hell makes zpool resilvering so slow? I'm running OpenSolaris 2009.06. I have had a large number of problematic disks due to a bad production batch, leading me to resilver quite a few times, progressively replacing each disk as it dies (and now preemptively removing disks.) My complaint is that resilvering ends up

Recovering from kernel panic / reboot cycle importing pool.

2011 Nov 25

Yesterday morning I awoke to alerts from my SAN that one of my OS disks was faulty, FMA said it was in hardware failure. By the time I got to work (1.5 hours after the email) ALL of my pools were in a degraded state, and "tank" my primary pool had kicked in two hot spares because it was so discombobulated. ------------------- EMAIL ------------------- List of faulty resources:

x4500 resilvering spare taking forever?

2009 Jun 19

I've got a Thumper running snv_57 and a large ZFS pool. I recently noticed a drive throwing some read errors, so I did the right thing and zfs replaced it with a spare. Everything went well, but the resilvering process seems to be taking an eternity: # zpool status pool: bigpool state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was

resilver question

2010 Oct 16

Hi all I'm seeing some rather bad resilver times for a pool of WD Green drives (I know, bad drives, but leave that). Does resilver go through the whole pool or just the VDEV in question?

Resilver making the system unresponsive

2010 Sep 29

This must be resilver day :) I just had a drive failure. The hot spare kicked in, and access to the pool over NFS was effectively zero for about 45 minutes. Currently the pool is still resilvering, but for some reason I can access the file system now. Resilver speed has been beaten to death I know, but is there a way to avoid this? For example, is more enterprisy hardware less susceptible to

zpool status keeps telling "resilvered"

2010 Dec 28

Hi! We have a raidz2 pool with 1 spare. Recently, one of the drives generated a lot of checksum errors, so it was automatically replaced with the spare. Since the errors stopped at some point, we figured that the drive itself was not at fault. We offlined it, zeroed it and onlined it again, started resilvering, and manually detached the spare drive. The zpool status is ONLINE and mentions

Zfs ignoring spares?

2010 Dec 05

Hi all I have installed a new server with 77 2TB drives in 11 7-drive RAIDz2 VDEVs, all on WD Black drives. Now, it seems two of these drives were bad, one of them had a bunch of errors, the other was very slow. After zfs offlining these and then zfs replacing them with online spares, resilver ended and I thought it'd be ok. Apparently not. Albeit the resilver succeeds, the pool status

OpenSolaris 2008.11 - resilver still restarting

2009 Jul 13

Just look at this. I thought all the restarting resilver bugs were fixed, but it looks like something odd is still happening at the start: Status immediately after starting resilver: # zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine

ZFS RAID-Z2 degraded vs RAID-Z1

2010 Apr 24

Had an idea, could someone please tell me why it's wrong? (I feel like it has to be). A RaidZ-2 pool with one missing disk offers the same failure resilience as a healthy RaidZ1 pool (no data loss when one disk fails). I had initially wanted to do single parity raidz pool (5disk), but after a recent scare decided raidz2 was the way to go. With the help of a sparse file

Drives going offline in Zpool

2013 Mar 23

Hi, I have a Dell md1200 connected to two heads (Dell R710). The heads have Perc H800 cards and the drives are configured in Raid0 (Virtual Disk) in the RAID controller. One of the drives had crashed and is replaced by a spare. Resilvering was triggered but fails to complete due to drives going offline. I have to reboot the head (R710) and the drives come online. This happened repeatedly when

How to unstick ZFS resilver?

2013 Oct 15

I have a large (88-drive) zpool in which a drive was recently replaced. (The pool has a bunch of duff Toshiba MK2001TRKB drives -- never ever pay money for these! -- and I'm trying to replace them one by one before they fail completely.) The resilver on the first drive replacement has been taking much much too long, and currently it's stuck in this state: pool: export state: DEGRADED

checksum errors increasing on "spare" vdev?

2010 Mar 17

Hi, One of my colleagues was confused by the output of 'zpool status' on a pool where a hot spare is being resilvered in after a drive failure: $ zpool status data pool: data state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub:

Is there any way to stop a resilver?

2010 Sep 29

Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.

Data corruption during resilver operation

2009 Mar 30

I'm in well over my head with this report from zpool status saying: root # zpool status z3 pool: z3 state: DEGRADED status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://www.sun.com/msg/ZFS-8000-8A
