Hi list,

I've got a system with 3 WD and 3 Seagate drives. Today I got an email saying that zpool status indicated one of the Seagate drives as REMOVED.

I've tried clearing the error, but the pool becomes faulted again. I've taken out the offending drive and plugged it into a Windows box with SeaTools installed. Unfortunately SeaTools finds nothing wrong with the drive.

Windows seems to see the drive details OK; of course I can't try anything ZFS related.

Is it worth RMAing it to Seagate anyway (considering they will apparently charge me if they don't think the drive is faulty), or are there some other tests I can try?

I've got the system powered down, as there wasn't room to install hot spares and I don't want to risk the rest of the pool with another failure.

Any tips appreciated.

Thanks
On Sun, Sep 11, 2011 at 11:41:32AM +0100, Matt Harrison wrote:

Hi,

> I've got a system with 3 WD and 3 Seagate drives. Today I got an email
> saying that zpool status indicated one of the Seagate drives as REMOVED.
>
> I've tried clearing the error, but the pool becomes faulted again. I've taken
> out the offending drive and plugged it into a Windows box with SeaTools
> installed. Unfortunately SeaTools finds nothing wrong with the drive.

I'm wondering: which OS version, driver and which controller? Also, is this always the 2nd drive of a 2-way mirror?

Regards,
jel.
--
Otto-von-Guericke University    http://www.cs.uni-magdeburg.de/
Department of Computer Science  Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany        Tel: +49 391 67 12768
On Sep 11, 2011, at 3:41 AM, Matt Harrison <iwasinnamuknow at genestate.com> wrote:

> Hi list,
>
> I've got a system with 3 WD and 3 Seagate drives. Today I got an email
> saying that zpool status indicated one of the Seagate drives as REMOVED.

The removed state can be the result of a transport issue. If this is a Solaris-based OS, then look at "fmadm faulty" for a diagnosis leading to a removal. If none, then look at "fmdump -eV" for errors relating to the disk. Last, check the "zpool history" to make sure one of those little imps didn't issue a "zpool remove" command.
 -- richard
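For example, on a Solaris-based system the checks would look something like this (a sketch only; the pool name "tank" is a placeholder):

    # fmadm faulty                         # any diagnosed faults naming the disk?
    # fmdump -eV | less                    # raw error telemetry; look for ereports for that device
    # zpool history tank | grep -i remove  # was "zpool remove" ever issued?
    # zpool status -v tank                 # current vdev states and error counters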
On Sep 11, 2011, at 13:01 , Richard Elling wrote:

> The removed state can be the result of a transport issue. If this is a Solaris-based
> OS, then look at "fmadm faulty" for a diagnosis leading to a removal. If none,
> then look at "fmdump -eV" for errors relating to the disk. Last, check the "zpool
> history" to make sure one of those little imps didn't issue a "zpool remove"
> command.

Definitely check your cabling; a few of my drives disappeared like this as 'REMOVED', and it turned out to be some loose SATA cables on my backplane.

--khd
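If a flaky cable or port is suspected, the transport-level view from the OS can help confirm it (a sketch for Solaris with a SATA/AHCI controller; the port name below is only an example and depends on your hardware):

    # iostat -En                      # per-device soft/hard/transport error counters
    # cfgadm -al                      # SATA ports should show connected/configured
    # cfgadm -c configure sata1/3     # re-attach a port after reseating the cable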
On 11/09/2011 18:32, Krunal Desai wrote:

> Definitely check your cabling; a few of my drives disappeared like this as
> 'REMOVED', and it turned out to be some loose SATA cables on my backplane.

Thanks guys,

I reinstalled the drive after testing on the Windows machine and it looks fine now. By the time I'd got on to the console it had already started resilvering. All done now, and hopefully it will stay like that for a while.

Thanks again, saved me some work
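Once a resilver like this finishes, a scrub is a cheap way to double-check that everything on the re-added drive verifies against its checksums (again, the pool name "tank" is a placeholder):

    # zpool scrub tank
    # zpool status -x      # reports "all pools are healthy" once the scrub completes clean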
It'd be worth still reseating the SATA cables on the backplane like Krunal recommended. Once the resilvering completes, of course ;)

--
Sriram

On 9/12/11, Matt Harrison <iwasinnamuknow at genestate.com> wrote:
> Thanks guys,
>
> I reinstalled the drive after testing on the Windows machine and it
> looks fine now. By the time I'd got on to the console it had already
> started resilvering. All done now, and hopefully it will stay like that
> for a while.
>
> Thanks again, saved me some work

--
Sent from my mobile device
Belenix: www.belenix.org
On Mon, Sep 12, 2011 at 12:52:42AM +0100, Matt Harrison wrote:

> I reinstalled the drive after testing on the Windows machine and it
> looks fine now. By the time I'd got on to the console it had already
> started resilvering. All done now, and hopefully it will stay like that
> for a while.

Hmmm, at least if S11x, a ZFS mirror, ICH10 and the cmdk (IDE) driver are involved, I'm 99.9% confident that "a while" will turn out to be only some days or weeks, no matter what platinum enterprise HDDs you use ;-)

Regards,
jel.
--
Otto-von-Guericke University    http://www.cs.uni-magdeburg.de/
Department of Computer Science  Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany        Tel: +49 391 67 12768
On 09/12/11 10:33, Jens Elkner wrote:

> Hmmm, at least if S11x, a ZFS mirror, ICH10 and the cmdk (IDE) driver are
> involved, I'm 99.9% confident that "a while" will turn out to be only some
> days or weeks, no matter what platinum enterprise HDDs you use ;-)

On Solaris 11 Express with a dual-drive mirror, ICH10 and the AHCI driver (not sure why you would purposely choose to run in IDE mode), resilvering a 1TB drive (Seagate ST310005N1A1AS-RK) went at a rate of 3.2GB/min. Deduplication was not enabled. Only hours for a 55% full mirror, not days or weeks.
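For reference, the arithmetic behind "only hours", assuming the reported rate held roughly steady for the whole pass:

    0.55 * 1000 GB       ≈ 550 GB of allocated data to resilver
    550 GB / 3.2 GB/min  ≈ 172 min, i.e. just under 3 hours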
On Mon, Sep 12, 2011 at 12:50:37PM -0400, John Martin wrote:

> On Solaris 11 Express with a dual-drive mirror, ICH10 and the AHCI
> driver (not sure why you would purposely choose to run in IDE mode)

Because some vendors think providing RAID and native IDE is sufficient, i.e. you can't simply choose AHCI in your BIOS ...

> resilvering a 1TB drive (Seagate ST310005N1A1AS-RK) went at a rate of
> 3.2GB/min. Deduplication was not enabled. Only hours for a 55% full
> mirror, not days or weeks.

Ohhh, I wasn't referring to resilver time, I meant "stays functional" ;-)

Have fun,
jel.
--
Otto-von-Guericke University    http://www.cs.uni-magdeburg.de/
Department of Computer Science  Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany        Tel: +49 391 67 12768