Hello list,

My ZFS pool has found its way into a bad state after a period of neglect, and now I'm having trouble recovering it.

The pool is a three-way mirror. Two of the disks started showing errors, so the pool was degraded. I shut down the system and started at the lowest level by running a diagnostic with ES Tool (Samsung). Sure enough, the two disks were showing bad sectors. After a low-level format I attempted to reintroduce these disks into the mirror. However, during resilvering the system would hang at about 50% and I had to reset it.

My next attempt was to leave only the single good disk in the system (detaching the mirrors) and run a scrub. Again the system hung at 50%.

My final attempt was to copy the data to a new pool using 'cp -R'. It ran well for some time, but the copy did not complete; it hung just like the resilver and the scrub.

The good disk still passes the Samsung (full) diagnostic with no issues found.

I'm not sure what to do next. Is my final pool completely lost?

I'll try other hardware (power supply, memory) next...

FYI: I'm using OpenSolaris Nevada (76 I believe) but have also tried the OpenSolaris 2008.05 Live CD.

Regards,

- Emiel
On Tue, 29 Jul 2008, Emiel van de Laar wrote:

> I'm not sure what to do next. Is my final pool completely lost?

It sounds like your "good" disk has some serious problems and that formatting the two disks with bad sectors was the wrong thing to do. You might have been able to recover using the two failing disks by removing the "good" disk which was causing the hang.

Since diagnostics on the "good" disk succeeded, there may still be some hope by using a low-level tool like 'dd' to transfer the underlying data to more reliable storage. If there is a successful transfer, then you have something to work with.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
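For illustration, the 'dd' transfer suggested above might look roughly like the sketch below. It is only a sketch under assumptions: the function name image_disk and both paths are hypothetical placeholders, and the right source device on the affected system has to be filled in by hand. The standard dd options conv=noerror,sync tell dd to continue after a read error and to pad any short or failed block with zeros, so that data after a bad region stays at its original offset in the image.

```shell
#!/bin/sh
# Sketch only: image a raw device (or any file) onto other storage.
# image_disk is a hypothetical helper name, not part of any tool.
#   $1 = source, e.g. a raw disk device (placeholder, fill in your own)
#   $2 = destination image file on more reliable storage
image_disk() {
    src=$1
    dst=$2
    # conv=noerror : do not stop at the first read error
    # conv=sync    : pad short/failed input blocks with NULs up to bs,
    #                which keeps later data at its original offset
    dd if="$src" of="$dst" bs=64k conv=noerror,sync
}

# Hypothetical usage (both paths are placeholders):
#   image_disk /dev/rdsk/<your-device> /backup/gooddisk.img
```

Because of conv=sync, an unreadable block shows up in the image as zeros rather than as a gap, and the final block is padded to a full 64k if the source size is not a multiple of the block size. If dd itself hangs on the same region where the scrub hung, that would point at the disk or the hardware path rather than at ZFS.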
Bob Friesenhahn wrote:
> On Tue, 29 Jul 2008, Emiel van de Laar wrote:
>
>> I'm not sure what to do next. Is my final pool completely lost?
>
> It sounds like your "good" disk has some serious problems and that
> formatting the two disks with bad sectors was the wrong thing to do.
> You might have been able to recover using the two failing disks by
> removing the "good" disk which was causing the hang.
>
> Since diagnostics on the "good" disk succeeded, there may still be
> some hope by using a low-level tool like 'dd' to transfer the
> underlying data to more reliable storage. If there is a successful
> transfer, then you have something to work with.

Good idea. This eliminates ZFS from the equation and should verify that the data is readable.

Also check for errors or faults discovered by FMA using fmdump. If these are consumer-grade disks, they may not return a failure when an unrecoverable read is attempted, but the request should time out eventually. You may be seeing serial timeouts, which should show up in the FMA records.

-- richard
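As a rough sketch of the FMA check mentioned above: fmdump with no options prints the fault log (diagnosed faults), and fmdump -e prints the error log (the individual error reports, where timeouts would be expected to appear). The wrapper function show_fma_logs is a hypothetical name used here only so the sketch fails cleanly on a system without fmdump.

```shell
#!/bin/sh
# Sketch only: dump the FMA fault log and error log.
# show_fma_logs is a hypothetical helper name.
show_fma_logs() {
    if ! command -v fmdump >/dev/null 2>&1; then
        echo "fmdump not found (is this a Solaris system?)" >&2
        return 127
    fi
    echo "== fault log =="
    fmdump          # diagnosed faults, one line per event
    echo "== error log =="
    fmdump -e       # raw error reports (ereports), one line per event
}

# Hypothetical usage:
#   show_fma_logs
# For full detail on the error reports, 'fmdump -eV' prints each one verbosely.
```

Error reports clustered around the time of each hang would support the theory that the disk is timing out on reads.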