I just got a call from another of our admins, as I am the resident ZFS expert. They have opened a support case with Oracle, but I figured I'd ask here as well, as this forum often provides better, faster answers :-)

We have a server (M4000) with 6 FC attached SE-3511 disk arrays (some behind a 6920 DSP engine). There are many LUNs, all about 500 GB and mirrored via ZFS. The LUNs from one tray of one of the direct attached 3511s went offline this morning. At that point 1/3 of the LUNs from this one array were affected and UNAVAIL. We had sufficient unused LUNs in the right places to substitute for the 9 failed LUNs, so I started a zpool replace to an unused LUN on another array. At that point the other 2/3 of the LUNs from this 3511 went offline, so one entire 3511 out of six was offline. No data loss, no major issues, as everything is mirrored across arrays.

Here is where the real problem starts. In order to get the failed 3511 back online, we did a cold start of it (this is actually a known failure mode, one we had not seen in over a year, where one drive failing takes out an entire tray in the 3511, and a large part of why we are using ZFS to mirror across 3511 RAID arrays). This brought all the LUNs from this 3511 back (although we have not yet done the zpool clear or online to bring all the vdevs back online). Unfortunately, at some point in here, on the vdev that was resilvering, the remaining good device tossed errors (probably transient FC errors as the restarting 3511 logged back onto the fabric), which caused ZFS to mark that device as UNAVAIL. We tested access to that LUN with dd and we can read data from it. We tried to zpool online this device, but the zpool command has not returned for over 5 minutes.

Is there a way (other than zpool online) to kick ZFS into rescanning the LUNs?

---or---

Are we going to have to export the zpool and then import it?

I wanted to get opinions here while the folks on site are running this past Oracle Support. I am NOT on site and can't pull config info for the faulted zpool, but what I recall is that it is composed of 11 mirror vdevs, each about 500 GB. Only one of the vdevs is FAULTED (the one that was resilvering), but two or three others have devices that are UNAVAIL or FAULTED (those vdevs are degraded, not faulted).

If I had realized the entire 3511 array had gone away and that we would be restarting it, I would NOT have attempted to replace the faulted LUN and we would probably be OK. Needless to say, we really don't want to have to restore the data from the backup (a ZFS send / recv replica at the other end of a 100 Mbps pipe), but we can if we cannot recover the data in the zpool.

So what is the best option here? And why isn't the zpool online returning?

The system is running 10U9 with (I think) the September 2010 CPU and a couple of multipathing / SAS / SATA point patches (for an MPxIO and SATA bug we found). The zpool version is 22; the zfs version is either 4 or 5 (I forget which). We are moving off of the 3511s and onto a stack of five J4400 with 750 GB SATA drives, but we aren't there yet :-(

P.S. The other zpools on the box are still up and running. The ones that had devices on the faulted 3511 are degraded but online; the ones that did not have devices on the faulted 3511 are OK. Because of these other zpools we can't really reboot the box or pull the FC connections.
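For reference, the sequence so far looked roughly like this; the pool and device names below are made up, since I can't pull the real config from here:

    # zpool status -x
        (pool shows DEGRADED, with the 9 LUNs from the failed tray UNAVAIL)
    # zpool replace tank c10t0d0 c11t0d0
        (substitute an unused LUN on another array for one of the failed LUNs)
    # dd if=/dev/rdsk/c10t1d0s0 of=/dev/null bs=128k count=1000
        (the "bad" side of the resilvering mirror still reads fine with dd)
    # zpool online tank c10t1d0
        (this is the command that has been hung for 5+ minutes)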
--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
On May 19, 2011, at 2:09 PM, Paul Kraus <paul at kraus-haus.org> wrote:

<snip>

> Is there a way (other than zpool online) to kick ZFS into
> rescanning the LUNs?

zpool clear poolname

<snip>

> If I had realized the entire 3511 array had gone away and that we
> would be restarting it, I would NOT have attempted to replace the
> faulted LUN and we would probably be OK.

yes

<snip>

> And why isn't the zpool online returning?

The ZFS Administration Guide has a pretty good description of the
troubleshooting methods and a description of the zpool clear command.

> P.S. The other zpools on the box are still up and running. The ones
> that had devices on the faulted 3511 are degraded but online; the ones
> that did not have devices on the faulted 3511 are OK. Because of these
> other zpools we can't really reboot the box or pull the FC
> connections.

Reboot isn't needed, this isn't a PeeCee :-)

-- richard
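P.S. For the archives, the sequence I have in mind looks roughly like this; the pool and device names are placeholders, so check them against your own zpool status output first:

    # zpool clear tank
        (clears the error counters and retries the devices that came back with the 3511)
    # zpool status tank
        (devices that are healthy again should show ONLINE and resilver as needed)
    # zpool online tank c10t1d0
        (only if an individual device is still marked offline/UNAVAIL after the clear)

Export/import is the bigger hammer and should not be needed if the LUNs are visible to the OS again:

    # zpool export tank
    # zpool import tank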
On Fri, May 20, 2011 at 12:53 AM, Richard Elling <richard.elling at gmail.com> wrote:
> On May 19, 2011, at 2:09 PM, Paul Kraus <paul at kraus-haus.org> wrote:
>> Is there a way (other than zpool online) to kick ZFS into
>> rescanning the LUNs?
>
> zpool clear poolname

I am unclear on when clear is the right command vs online. I have not gotten consistent information from Oracle. Can Richard (or someone else) please summarize here? Thanks.

<snip>

>> If I had realized the entire 3511 array had gone away and that we
>> would be restarting it, I would NOT have attempted to replace the
>> faulted LUN and we would probably be OK.
>
> yes

Yeah, hindsight and all that. But at the moment I hit return on the zpool replace, we still only had one of three trays faulted on the 3511 ... sigh.

<snip>

>> P.S. The other zpools on the box are still up and running. The ones
>> that had devices on the faulted 3511 are degraded but online; the ones
>> that did not have devices on the faulted 3511 are OK. Because of these
>> other zpools we can't really reboot the box or pull the FC
>> connections.
>
> Reboot isn't needed, this isn't a PeeCee :-)

Oracle support recommended a reboot (which did clear the ZFS issue). I was not at the office to try to get a better solution out of Oracle.

Now this morning, the original tray in the 3511 that failed is offline again, but ... this time it is not the bug we have run into before, but a genuine failure of more than one drive in a RAID set. So now I am zpool replacing the faulted LUNs (and have asked that no one reboot any 3511s until I am done :-)

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
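P.S. For completeness, the per-LUN replacement I am running now looks roughly like this (device names are placeholders again):

    # zpool replace tank c10t2d0 c12t0d0
        (swap the genuinely failed 3511 LUN for a spare LUN on another array)
    # zpool status -v tank
        (watch the resilver progress; repeat the replace for each faulted LUN)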