Jim Hranicky
2006-Nov-21 19:28 UTC
[zfs-discuss] zfs hot spare not automatically getting used
OS: Nevada build 51 x86

I recently upgraded Sol10x86 6/6 to Nevada build 51. I'm testing out zfs on a machine and set up a pool with a mirror of two drives and two hot spares. I then spun down a drive in the mirror, which caused the machine to hang, so I rebooted the host. After the reboot, the mirror came up in degraded mode but neither of the spares was automatically used.

Is there something I need to tweak to get this to work?

This message posted from opensolaris.org
Sanjeev Bagewadi
2006-Nov-22 05:55 UTC
[zfs-discuss] zfs hot spare not automatically getting used
Jim,

We hit a similar issue yesterday on build 50 and build 45, although the node did not hang. In one of the cases we saw that the hot spare was not the same size as the other disks. Can you check whether this is true in your case?

Do you have a threadlist from the node when it was hung? That would reveal some info.

Thanks and regards,
Sanjeev.

--
Solaris Revenue Products Engineering, India Engineering Center, Sun Microsystems India Pvt Ltd. Tel: x27521 +91 80 669 27521
James F. Hranicky
2006-Nov-27 20:52 UTC
[zfs-discuss] zfs hot spare not automatically getting used
[ Sorry, this bounced the first time so I subscribed to the list ]

Sanjeev Bagewadi wrote:
> Jim,
>
> We did hit similar issue yesterday on build 50 and build 45 although the
> node did not hang.
> In one of the cases we saw that the hot spare was not of the same
> size... can you check if this true ?

It looks like they're all slightly different sizes.

> Do you have a threadlist from the node when it was hung ? That would
> reveal some info.

Unfortunately I don't. Do you mean the output of ::threadlist -v from mdb -k? I can try the whole process again and see what I get.

Jim
Jim Hranicky
2006-Nov-28 19:22 UTC
[zfs-discuss] Re: zfs hot spare not automatically getting used
So is there a command to make the spare get used, or do I have to remove it as a spare and add it back if it doesn't get used automatically?

Is this a bug to be fixed, or will this always be the case when the disks aren't exactly the same size?
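For reference, a spare can also be activated by hand with `zpool replace`, naming the failed device first and the spare second. The following is a minimal sketch, not something anyone in this thread ran: the pool and device names are borrowed from the status output posted later in the thread, and `run` is a hypothetical helper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: manually swap a hot spare in for a failed device.
# Pool and device names are assumptions taken from elsewhere in the thread.

# Hypothetical helper: print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# activate_spare POOL FAILED_DEVICE SPARE_DEVICE
activate_spare() {
    # zpool replace <pool> <old-device> <new-device>
    run zpool replace "$1" "$2" "$3"
    # Show the pool afterwards so the resilver can be watched.
    run zpool status "$1"
}

activate_spare zmir c3t1d0 c3t3d0
```

Once the original disk is healthy again, `zpool detach zmir c3t3d0` should return the spare to the available list.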
Sanjeev Bagewadi
2006-Nov-29 04:23 UTC
[zfs-discuss] zfs hot spare not automatically getting used
Jim,

James F. Hranicky wrote:
> Sanjeev Bagewadi wrote:
>> Jim,
>>
>> We did hit similar issue yesterday on build 50 and build 45 although the
>> node did not hang.
>> In one of the cases we saw that the hot spare was not of the same
>> size... can you check if this true ?
>
> It looks like they're all slightly different sizes.

Interestingly, during our demo runs at the recent FOSS event (http://foss.in) we had no issues with this (snv build 45). We had a RAIDZ config of 3 disks and 1 spare disk, and we found that the spare kicked in. Here is how we tried it:

- Pulled out one of the 3 disks.
- Kicked off a write to the FS on the pool (i.e. dd to a new file in the FS).
- The spare kicked in after a while.

I guess there is some delay in the detection. I am not sure if there is some threshold beyond which it kicks in. I need to check the code for this.

>> Do you have a threadlist from the node when it was hung ? That would
>> reveal some info.
>
> Unfortunately I don't. Do you mean the output of
>
>   ::threadlist -v

Yes. That would be useful. Also, check the zpool status output.

> from
>
>   mdb -k

Run the following:

# echo "::threadlist -v" | mdb -k > /var/tmp/threadlist.out

Regards,
Sanjeev.
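Because the spare only kicked in "after a while" in the test above, it can help to poll the pool rather than check it once. This is a rough sketch under stated assumptions: the function names, the 300-second timeout and the 5-second interval are all hypothetical, and matching the string INUSE in `zpool status` output is a crude heuristic rather than a documented interface.

```shell
#!/bin/sh
# Sketch: wait for a hot spare to be pulled into a pool.

# Thin wrapper around zpool status so it can be replaced when experimenting.
pool_status() {
    zpool status "$1"
}

# wait_for_spare POOL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 as soon as a spare shows INUSE, 1 if the timeout expires.
wait_for_spare() {
    pool=$1
    timeout=${2:-300}    # assumed default, not taken from the ZFS code
    interval=${3:-5}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Redirect instead of grep -q, which older greps may lack.
        if pool_status "$pool" | grep INUSE >/dev/null; then
            echo "spare in use after ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=`expr "$elapsed" + "$interval"`
    done
    echo "no spare in use after ${timeout}s" >&2
    return 1
}
```

A typical run would pull the disk, start the dd write, and then call `wait_for_spare <pool>` with the pool's name. If it times out, capturing the threadlist as described above is the next step.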
Jim Hranicky
2006-Nov-29 20:11 UTC
[zfs-discuss] Re: zfs hot spare not automatically getting used
> >>Do you have a threadlist from the node when it was hung ? That would
> >>reveal some info.
> >
> >Unfortunately I don't. Do you mean the output of
> >
> >  ::threadlist -v
>
> Yes. That would be useful.

OK, spun down the drives again. Here's that output:

  http://www.cise.ufl.edu/~jfh/zfs/threads

Here's the output after boot:

  http://www.cise.ufl.edu/~jfh/zfs/threads-after-boot

> Also, check the zpool status output.

This hangs and is unkillable. The node also has to be power-cycled, as it hangs on a reboot. Until the reboot it seems to work OK, though it spits out a ton of SCSI errors.
Jim Hranicky
2006-Nov-29 20:58 UTC
[zfs-discuss] Re: zfs hot spare not automatically getting used
I know this isn't necessarily ZFS-specific, but after I reboot I spin the drives back up, and nothing I do (devfsadm, disks, etc.) gets them seen again until the next reboot.

I've got some older SCSI drives in an old Andataco Gigaraid enclosure which I thought supported hot swap, but I seem unable to hot swap them in. The PC has an Adaptec 39160 card in it and I'm running Nevada b51.

Is this not a setup that can support hot swap? Or is there something I have to do other than devfsadm to get the SCSI bus rescanned?
Jim Hranicky
2006-Nov-29 21:39 UTC
[zfs-discuss] Re: zfs hot spare not automatically getting used
> > OK, spun down the drives again. Here's that output:
> >
> >   http://www.cise.ufl.edu/~jfh/zfs/threads

I just realized that I changed the configuration, so that doesn't reflect a system with spares, sorry.

However, I reinitialized the pool and spun down one of the drives, and everything is working as it should:

  pool: zmir
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Wed Nov 29 16:29:53 2006
config:

        NAME          STATE     READ WRITE CKSUM
        zmir          DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t0d0    ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              c3t1d0  UNAVAIL     10 28.88     0  cannot open
              c3t3d0  ONLINE       0     0     0
        spares
          c3t3d0      INUSE     currently in use
          c3t4d0      AVAIL

errors: No known data errors

I'm just not sure if it will always work. I'll try a few different configs and see what happens.
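When trying several configurations, it may be handy to reduce status output like the above to just the spare devices and their states. The following is a small sketch that assumes the layout shown in this message (a line reading only "spares", followed by one device per line, ending at a blank line or the "errors:" line). The function name `spare_states` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list each hot spare and its state from zpool status text on stdin.
# Assumes the "spares" section layout shown in the message above.
spare_states() {
    awk '
        # A line holding only the word "spares" opens the section.
        $1 == "spares" && NF == 1 { in_spares = 1; next }
        # A blank line or the errors: line closes it.
        in_spares && NF == 0      { in_spares = 0 }
        in_spares && /^errors:/   { in_spares = 0 }
        # Inside the section, print device name and state.
        in_spares                 { print $1, $2 }
    '
}
```

Used as `zpool status zmir | spare_states`, this would print one line per spare, for example a device name followed by INUSE or AVAIL.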
Sanjeev Bagewadi
2006-Nov-30 04:24 UTC
[zfs-discuss] Re: zfs hot spare not automatically getting used
Jim,

That is good news! Let us know how it goes.

Regards,
Sanjeev.

PS: I am out of the office for a couple of days.