Laurent Blume
2009-Jul-13 06:08 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
(As I'm not subscribed to this list, you can keep me in CC:, but I'll check out the Jive thread)

Hi all,

I've seen this question asked several times, but there wasn't any solution provided. I'm trying to offline a faulted device in a RAID-Z2 vdev on Solaris 10. This is done according to the documentation:
http://docs.sun.com/app/docs/doc/819-5461/gazfy?l=en&a=view

However, I always get the same message:
cannot offline c2t1d0: no valid replicas

How come? It's a RAID-Z2 pool, it should (and it does) work fine without one device. What am I doing wrong?

TIA!

# zpool status data
  pool: data
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        DEGRADED     0     0     0
          raidz2    DEGRADED     0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  FAULTED      3   636     1  too many errors
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0

errors: No known data errors

# zpool offline -t data c2t1d0
cannot offline c2t1d0: no valid replicas

Laurent
Ross
2009-Jul-13 12:29 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
Yup, just hit exactly the same myself. I have a feeling this faulted disk is affecting performance, so I tried to remove or offline it:

$ zpool iostat -v 30

                capacity     operations    bandwidth
pool          used  avail   read  write   read  write
----------   -----  -----  -----  -----  -----  -----
rc-pool      1.27T  1015G    682     71  84.0M  1.88M
  mirror      199G   265G      0      5      0  21.1K
    c4t1d0       -      -      0      2      0  21.1K
    c4t2d0       -      -      0      0      0      0
    c5t1d0       -      -      0      2      0  21.1K
  mirror      277G   187G    170      7  21.1M   322K
    c4t3d0       -      -     58      4  7.31M   322K
    c5t2d0       -      -     54      4  6.83M   322K
    c5t0d0       -      -     56      4  6.99M   322K
  mirror      276G   188G    171      6  21.1M   336K
    c5t3d0       -      -     56      4  7.03M   336K
    c4t5d0       -      -     56      3  7.03M   336K
    c4t4d0       -      -     56      3  7.04M   336K
  mirror      276G   188G    169      6  20.9M   353K
    c5t4d0       -      -     57      3  7.17M   353K
    c5t5d0       -      -     54      4  6.79M   353K
    c4t6d0       -      -     55      3  6.99M   353K
  mirror      277G   187G    171     10  20.9M   271K
    c4t7d0       -      -     56      4  7.11M   271K
    c5t6d0       -      -     55      5  6.93M   271K
    c5t7d0       -      -     55      5  6.88M   271K
  c6d1p0        32K   504M      0     34      0   620K
----------   -----  -----  -----  -----  -----  -----

20MB in 30 seconds for 3 disks... that's 220kb/s. Not healthy at all.

$ zpool status
  pool: rc-pool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: scrub completed after 2h55m with 0 errors on Tue Jun 23 11:11:42 2009
config:

        NAME        STATE     READ WRITE CKSUM
        rc-pool     DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t2d0  FAULTED  1.71M 23.3M     0  too many errors
            c5t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
            c5t5d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
            c5t6d0  ONLINE       0     0     0
            c5t7d0  ONLINE       0     0     0
        logs        DEGRADED     0     0     0
          c6d1p0    ONLINE       0     0     0

errors: No known data errors

# zpool offline rc-pool c4t2d0
cannot offline c4t2d0: no valid replicas
# zpool remove rc-pool c4t2d0
cannot remove c4t2d0: only inactive hot spares or cache devices can be removed
Thomas Liesner
2009-Jul-15 09:12 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
You can't replace it because this disk is still a valid member of the pool, although it is marked faulty. Put in a replacement disk, add this to the pool and replace the faulty one with the new disk.

Regards,
Tom
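(For reference, a minimal sketch of that replacement with the pool from the first post. The new device name c2t5d0 is purely hypothetical; use whatever name the replacement disk actually appears under. zpool replace and zpool status themselves are standard commands:)

    # Resilver the faulted raidz2 member onto the new disk
    # (c2t5d0 is a hypothetical name for the replacement disk)
    zpool replace data c2t1d0 c2t5d0

    # Watch the resilver progress and the vdev return to ONLINE
    zpool status data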
Laurent Blume
2009-Jul-15 14:32 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
I don't have a replacement, but I don't want the disk to be used right now by the volume: how do I do that? This is exactly the point of the offline command as explained in the documentation: disabling unreliable hardware, or removing it temporarily.

So is this a huge bug in the documentation? What's the point of the command if its stated purpose doesn't work? I'm really puzzled now.

Laurent
Thomas Liesner
2009-Jul-15 15:53 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
You could offline the disk if *this* disk (not the pool) had a replica. Nothing wrong with the documentation. Hmm, maybe it is a little misleading here. I walked into the same "trap".

The pool is not using the disk anymore anyway, so (from the ZFS point of view) there is no need to offline the disk. If you want to stop the I/O system from trying to access the disk, pull it out or wait until it gives up...
Laurent Blume
2009-Jul-16 07:39 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
> You could offline the disk if *this* disk (not the pool) had a replica.
> Nothing wrong with the documentation. Hmm, maybe it is a little misleading
> here. I walked into the same "trap".

I apologize for being daft here, but I don't find any ambiguity in the documentation. This is explicitly stated as being possible.

"This scenario is possible assuming that the systems in question see the storage once it is attached to the new switches, possibly through different controllers than before, and your pools are set up as RAID-Z or mirrored configurations."

And lower, it even says that it's not possible to offline two devices in a RAID-Z, with that exact error as an example:

"You cannot take a pool offline to the point where it becomes faulted. For example, you cannot take offline two devices out of a RAID-Z configuration, nor can you take offline a top-level virtual device.

# zpool offline tank c1t0d0
cannot offline c1t0d0: no valid replicas
"

http://docs.sun.com/app/docs/doc/819-5461/gazgm?l=en&a=view

I don't understand what you mean by this disk not having a replica. It's RAID-Z2: by definition, all the data it contains is replicated on two other disks in the pool. That's why the pool is still working fine.

> The pool is not using the disk anymore anyway, so (from the ZFS point of
> view) there is no need to offline the disk. If you want to stop the I/O
> system from trying to access the disk, pull it out or wait until it gives
> up...

Yes, there is. I don't want the disk to become online if the system reboots, because what actually happens is that it *never* gives up (well, at least not in more than 24 hours), and all I/O to the zpool stops as long as there are those errors. Yes, I know it should continue working. In practice, it does not (though it used to be much worse in previous versions of S10, with all I/O stopping on all disks and volumes, both ZFS and UFS, and usually ending in a panic).
And the zpool command hangs, and never finishes. The only way to get out of it is to use cfgadm to send multiple hardware resets to the SATA device, then disconnect it. At this point, zpool completes and shows the disk as having faulted.

Laurent
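(For reference, the cfgadm sequence described above looks roughly like the sketch below. The attachment point ID sata1/3 is purely hypothetical; check the output of cfgadm -al for the one that actually backs the faulted disk, and see cfgadm_sata(1M) for the exact -x subcommands your SATA plugin supports:)

    # List SATA attachment points and find the one backing the faulted disk
    cfgadm -al

    # Reset the device behind the (hypothetical) attachment point sata1/3
    cfgadm -x sata_reset_device sata1/3

    # Unconfigure and disconnect it so the kernel stops retrying I/O against it
    cfgadm -c unconfigure sata1/3
    cfgadm -c disconnect sata1/3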
Thomas Liesner
2009-Jul-16 09:47 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
You're right, from the documentation it definitely should work. Still, it doesn't. At least not in Solaris 10. But I am not a ZFS developer, so this should probably be answered by them. I will give it a try with a recent OpenSolaris VM and check whether this works in newer implementations of ZFS.

> > The pool is not using the disk anymore anyway, so (from the ZFS point of
> > view) there is no need to offline the disk. If you want to stop the I/O
> > system from trying to access the disk, pull it out or wait until it gives
> > up...
>
> Yes, there is. I don't want the disk to become online if the system reboots,
> because what actually happens is that it *never* gives up (well, at least
> not in more than 24 hours), and all I/O to the zpool stops as long as there
> are those errors. Yes, I know it should continue working. In practice, it
> does not (though it used to be much worse in previous versions of S10, with
> all I/O stopping on all disks and volumes, both ZFS and UFS, and usually
> ending in a panic).
> And the zpool command hangs, and never finishes. The only way to get out of
> it is to use cfgadm to send multiple hardware resets to the SATA device,
> then disconnect it. At this point, zpool completes and shows the disk as
> having faulted.

Again you are right that this is a very annoying behaviour. The same thing happens with DiskSuite pools and UFS when a disk is failing as well, though. For me it is not a ZFS problem, but a Solaris problem. The kernel should stop trying to access failing disks a LOT earlier instead of blocking the complete I/O for the whole system.

I always understood ZFS as a concept for hot-pluggable disks. This is the way I use it, and that is why I never really had this problem. Whenever I run into this behaviour, I simply pull the disk in question and replace it. Those "hiccups" have never affected the performance of our production environment for longer than a couple of minutes.

Tom
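(A minimal sketch of that pull-and-swap flow with Ross's pool from earlier in the thread, assuming the replacement goes into the same bay and shows up under the same device name, in which case the single-argument form of zpool replace applies:)

    # After hot-swapping the drive in the same slot, resilver onto it in place
    zpool replace rc-pool c4t2d0

    # Once the resilver finishes, clear the old error counters on that vdev
    zpool clear rc-pool c4t2d0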
Thomas Liesner
2009-Jul-16 13:05 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
FYI: In b117 it works as expected and as stated in the documentation.

Tom
Ross
2009-Jul-16 15:08 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
Great news, thanks Tom!
Cindy.Swearingen at Sun.COM
2009-Jul-17 17:15 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
Hi Laurent,

Yes, you should be able to offline a faulty device in a redundant configuration as long as enough devices are available to keep the pool redundant.

On my Solaris Nevada system (latest bits), injecting a fault into a disk in a RAID-Z configuration and then offlining that disk works as expected.

On my Solaris 10 system, I'm unable to offline a faulted disk in a RAID-Z configuration, so I will get back to you with a bug ID or some other plausible explanation.

Thanks for reporting this problem.

Cindy
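(For reference, the sequence that is expected to succeed on the Nevada bits, sketched with the pool and disk names from Laurent's first post; the commands are standard zpool subcommands, the expected states are an assumption based on Cindy's description:)

    # Take the faulted disk out of service; the raidz2 vdev keeps the pool usable
    zpool offline data c2t1d0

    # The disk should now show OFFLINE and the pool stays DEGRADED but available
    zpool status data

    # Bring it back (or zpool replace it) once the hardware has been dealt with
    zpool online data c2t1d0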
Laurent Blume
2009-Jul-20 14:33 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
> You're right, from the documentation it definitely should work. Still, it
> doesn't. At least not in Solaris 10. But I am not a ZFS developer, so this
> should probably be answered by them. I will give it a try with a recent
> OpenSolaris VM and check whether this works in newer implementations of ZFS.

Thanks for confirming that it does work in b117. I'm not able to test it easily at the moment.

> Again you are right that this is a very annoying behaviour. The same thing
> happens with DiskSuite pools and UFS when a disk is failing as well, though.
> For me it is not a ZFS problem, but a Solaris problem. The kernel should
> stop trying to access failing disks a LOT earlier instead of blocking the
> complete I/O for the whole system.

I think it's both. At least, it used to be very much on the ZFS side. My understanding is that it has been improved to better handle issues reported by the driver. But now, those thousands of retries in the logs are pretty much useless. The system should indeed provide a way to automatically isolate such a disk, which could trigger a ZFS panic if it makes a zpool faulted, but ZFS does handle such cases. I understand it's not an easy task ;-)

> I always understood ZFS as a concept for hot-pluggable disks. This is the
> way I use it, and that is why I never really had this problem. Whenever I
> run into this behaviour, I simply pull the disk in question and replace it.
> Those "hiccups" have never affected the performance of our production
> environment for longer than a couple of minutes.

Ah, that's basically what I'm doing remotely with cfgadm. I'm a few thousand kilometers away from those disks, and worse, I was cheap at the time: I didn't buy an enclosure with removable drives. Well, I didn't expect so many issues; I've had some bad luck with it from the beginning.

Laurent
Laurent Blume
2009-Jul-20 14:36 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
Thanks a lot, Cindy!

Let me know how it goes or if I can provide more info.
Part of the bad luck I've had with that set is that it reports such errors about once a month, then everything goes back to normal again. So I'm pretty sure that I'll be able to try to offline the disk someday.

Laurent
Cindy.Swearingen at Sun.COM
2009-Jul-27 23:05 UTC
[zfs-discuss] Can't offline a RAID-Z2 device: "no valid replica"
Hi Laurent,

I was able to reproduce it on a Solaris 10 5/09 system. The problem is fixed in the current Nevada bits and also in the upcoming Solaris 10 release.

The bug fix that integrated this change might be this one:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6328632
zpool offline is a bit too conservative

I can understand that you would want to offline a faulty disk. In the meantime, you might use fmdump to help isolate the transient error problems.

Thanks,

Cindy
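(A minimal sketch of the kind of fmdump usage meant here; the date argument is only an example, see fmdump(1M) for the accepted time formats:)

    # Summarize the faults FMA has diagnosed (one line per fault event)
    fmdump

    # Dump the underlying error reports (ereports) in full detail
    fmdump -eV

    # Restrict the error log to a time window, e.g. everything since 01 July 2009
    fmdump -e -t 01Jul09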