similar to: possible resilver bugs

Displaying 20 results from an estimated 4000 matches similar to: "possible resilver bugs"

2007 Dec 12
0
Degraded zpool won't online disk device, instead resilvers spare
I've got a zpool that has 4 raidz2 vdevs each with 4 disks (750GB), plus 4 spares. At one point 2 disks failed (in different vdevs). The message in /var/adm/messages for the disks were 'device busy too long'. Then SMF printed this message: Nov 23 04:23:51 x.x.com EVENT-TIME: Fri Nov 23 04:23:51 EST 2007 Nov 23 04:23:51 x.x.com PLATFORM: Sun Fire X4200 M2, CSN:
2010 Jul 05
5
never ending resilver
Hi list, Here's my case: pool: mypool state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub: resilver in progress for 147h19m, 100.00% done, 0h0m to go config: NAME STATE READ WRITE CKSUM filerbackup13
2007 Sep 08
1
zpool degraded status after resilver completed
I am curious why zpool status reports a pool to be in the DEGRADED state after a drive in a raidz2 vdev has been successfully replaced. In this particular case drive c0t6d0 was failing so I ran, zpool offline home/c0t6d0 zpool replace home c0t6d0 c8t1d0 and after the resilvering finished the pool reports a degraded state. Hopefully this is incorrect. At this point is the vdev in question now has
2010 Apr 14
1
Checksum errors on and after resilver
Hi all, I recently experienced a disk failure on my home server and observed checksum errors while resilvering the pool and on the first scrub after the resilver had completed. Now everything seems fine but I'm posting this to get help with calming my nerves and detect any possible future faults. Let's start with some specs. OSOL 2009.06 Intel SASUC8i (w LSI 1.30IT FW) Gigabyte
2007 Apr 11
0
raidz2 another resilver problem
Hello zfs-discuss, One of a disk started to behave strangely. Apr 11 16:07:42 thumper-9.srv sata: [ID 801593 kern.notice] NOTICE: /pci at 1,0/pci1022,7458 at 3/pci11ab,11ab at 1: Apr 11 16:07:42 thumper-9.srv port 6: device reset Apr 11 16:07:42 thumper-9.srv scsi: [ID 107833 kern.warning] WARNING: /pci at 1,0/pci1022,7458 at 3/pci11ab,11ab at 1/disk at 6,0 (sd27): Apr 11 16:07:42 thumper-9.srv
2009 Oct 30
1
internal scrub keeps restarting resilvering?
After several days of trying to get a 1.5TB drive to resilver and it continually restarting, I eliminated all of the snapshot-taking facilities which were enabled and 2009-10-29.14:58:41 [internal pool scrub done txg:567780] complete=0 2009-10-29.14:58:41 [internal pool scrub txg:567780] func=1 mintxg=3 maxtxg=567354 2009-10-29.16:52:53 [internal pool scrub done txg:567999] complete=0
2009 Jul 10
5
Slow Resilvering Performance
I know this topic has been discussed many times... but what the hell makes zpool resilvering so slow? I'm running OpenSolaris 2009.06. I have had a large number of problematic disks due to a bad production batch, leading me to resilver quite a few times, progressively replacing each disk as it dies (and now preemptively removing disks.) My complaint is that resilvering ends up
2011 Nov 25
1
Recovering from kernel panic / reboot cycle importing pool.
Yesterday morning I awoke to alerts from my SAN that one of my OS disks was faulty, FMA said it was in hardware failure. By the time I got to work (1.5 hours after the email) ALL of my pools were in a degraded state, and "tank" my primary pool had kicked in two hot spares because it was so discombobulated. ------------------- EMAIL ------------------- List of faulty resources:
2009 Jun 19
8
x4500 resilvering spare taking forever?
I've got a Thumper running snv_57 and a large ZFS pool. I recently noticed a drive throwing some read errors, so I did the right thing and zfs replaced it with a spare. Everything went well, but the resilvering process seems to be taking an eternity: # zpool status pool: bigpool state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was
2010 Oct 16
4
resilver question
Hi all I'm seeing some rather bad resilver times for a pool of WD Green drives (I know, bad drives, but leave that). Does resilver go through the whole pool or just the VDEV in question?
2010 Sep 29
10
Resilver making the system unresponsive
This must be resilver day :) I just had a drive failure. The hot spare kicked in, and access to the pool over NFS was effectively zero for about 45 minutes. Currently the pool is still resilvering, but for some reason I can access the file system now. Resilver speed has been beaten to death I know, but is there a way to avoid this? For example, is more enterprisey hardware less susceptible to
2010 Dec 28
2
zpool status keeps telling "resilvered"
Hi! We have a raidz2 pool with 1 spare. Recently, one of the drives generated a lot of checksum errors, so it was automatically replaced with the spare. Since the errors stopped at some point, we figured that the drive itself was not at fault. We offlined it, zeroed it and onlined it again, started resilvering, and manually detached the spare drive. The zpool status is ONLINE and mentions
2010 Dec 05
4
Zfs ignoring spares?
Hi all I have installed a new server with 77 2TB drives in 11 7-drive RAIDz2 VDEVs, all on WD Black drives. Now, it seems two of these drives were bad, one of them had a bunch of errors, the other was very slow. After zfs offlining these and then zfs replacing them with online spares, resilver ended and I thought it'd be ok. Apparently not. Albeit the resilver succeeds, the pool status
2009 Jul 13
7
OpenSolaris 2008.11 - resilver still restarting
Just look at this. I thought all the restarting resilver bugs were fixed, but it looks like something odd is still happening at the start: Status immediately after starting resilver: # zpool status pool: rc-pool state: DEGRADED status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine
2010 Apr 24
3
ZFS RAID-Z2 degraded vs RAID-Z1
Had an idea, could someone please tell me why it's wrong? (I feel like it has to be). A RaidZ-2 pool with one missing disk offers the same failure resilience as a healthy RaidZ1 pool (no data loss when one disk fails). I had initially wanted to do single parity raidz pool (5disk), but after a recent scare decided raidz2 was the way to go. With the help of a sparse file
2013 Mar 23
0
Drives going offline in Zpool
Hi, I have Dell md1200 connected to two heads ( Dell R710 ). The heads have Perc H800 card and drives are configured in Raid0 ( Virtual Disk) in the RAID controller. One of the drives had crashed and is replaced by a spare. Resilvering was triggered but fails to complete due to drives going offline. I have to reboot the head ( R710) and drives comes online. This happened repeatedly when
2013 Oct 15
0
How to unstick ZFS resilver?
I have a large (88-drive) zpool in which a drive was recently replaced. (The pool has a bunch of duff Toshiba MK2001TRKB drives -- never ever pay money for these! -- and I'm trying to replace them one by one before they fail completely.) The resilver on the first drive replacement has been taking much much too long, and currently it's stuck in this state: pool: export state: DEGRADED
2010 Mar 17
0
checksum errors increasing on "spare" vdev?
Hi, One of my colleagues was confused by the output of 'zpool status' on a pool where a hot spare is being resilvered in after a drive failure: $ zpool status data pool: data state: DEGRADED status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state. action: Wait for the resilver to complete. scrub:
2010 Sep 29
7
Is there any way to stop a resilver?
Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
2009 Mar 30
3
Data corruption during resilver operation
I'm in well over my head with this report from zpool status saying: root # zpool status z3 pool: z3 state: DEGRADED status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://www.sun.com/msg/ZFS-8000-8A