Hi all,

I have a Dell R710 server running 8.0R/amd64, with a PERC 6 RAID controller
and four SAS drives in a RAID10 configuration. The RAID controller does a
weekly "patrol read" that threw up a load of errors in the most recent run:

+=======================================================================
+seqNum: 0x00000b5e
+Time: Tue Aug 3 22:06:15 2010
+
+Code: 0x00000071
+Class: 0
+Locale: 0x02
+Event Description: Unexpected sense: PD 02(e0x20/s2) Path 5000c5000561dfc9, CDB: 2f 00 19 21 40 00 00 10 00 00, Sense: 3/11/00
+Event Data:
+ Device ID: 2
+ Enclosure Index: 32
+ Slot Number: 2
+ CDB Length: 10
+ CDB Data:
+ 002f 0000 0019 0021 0040 0000 0000 0010 0000 0000 0000 0000 0000 0000 0000 0000 Sense Length: 18
+ Sense Data:
+ 00f0 0000 0003 0019 0021 004b 00e1 000a 0000 0000 0000 0000 0011 0000 0081 0080 0000 0097 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
+
+=======================================================================
+seqNum: 0x00000b5f
+Time: Tue Aug 3 22:06:15 2010
+
+Code: 0x0000005d
+Class: 0
+Locale: 0x02
+Event Description: Patrol Read corrected medium error on PD 02(e0x20/s2) at 19214be1
+Event Data:
+ Device ID: 2
+ Enclosure Index: 32
+ Slot Number: 2
+ LBA: 421612513
+
+=======================================================================

...and a lot more of the same.
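[Editorial sketch] The first event above carries enough raw data to see what the controller is reporting. Assuming the standard SCSI layouts (a 10-byte VERIFY(10) CDB and fixed-format sense data), the bytes can be decoded roughly as follows. The function names are invented for this example and are not part of any controller tool; the byte strings are copied from the log, where each byte is printed as a 16-bit word (so "002f" is byte 0x2f).

```python
# Sketch: decode the CDB and sense bytes quoted in the event log above.
# Offsets follow the standard VERIFY(10) CDB and fixed-format sense
# layouts; decode_verify10 and decode_fixed_sense are hypothetical names.

def decode_verify10(cdb: bytes) -> dict:
    """Extract the starting LBA and block count from a VERIFY(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x2F:
        raise ValueError("not a VERIFY(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: start LBA
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: length
    }


def decode_fixed_sense(sense: bytes) -> dict:
    """Extract the main fields from fixed-format sense data."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "info_valid": bool(sense[0] & 0x80),               # INFORMATION usable
        "sense_key": sense[2] & 0x0F,                      # 3 = MEDIUM ERROR
        "information": int.from_bytes(sense[3:7], "big"),  # failing LBA here
        "asc": sense[12],                                  # additional sense code
        "ascq": sense[13],                                 # its qualifier
    }


cdb = bytes.fromhex("2f 00 19 21 40 00 00 10 00 00")
sense = bytes.fromhex("f0 00 03 19 21 4b e1 0a 00 00 00 00 11 00 81 80 00 97")

verify = decode_verify10(cdb)
result = decode_fixed_sense(sense)

print(f"verify LBA 0x{verify['lba']:08x}, {verify['blocks']} blocks")
print(f"sense {result['sense_key']:x}/{result['asc']:02x}/{result['ascq']:02x}"
      f" at LBA {result['information']}")
```

Read this way, the drive was asked to verify 0x1000 blocks starting at LBA 0x19214000 and answered with sense 3/11/00 (medium error, unrecovered read error) at LBA 421612513, which is the same address the second event says the patrol read then corrected.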
Everything else on the machine is still working fine as far as I can tell,
and the status of the RAID volume is still reported as "Optimal":

Checking status of MFI RAID controllers:
Adapter: 0
------------------------------------------------------------------------
Physical Drive Information:
ENC SLO DEV SEQ MEC OEC PFC LPF STATE
32  0   0   2   0   0   0   0   Online
32  1   1   2   0   0   0   0   Online
32  2   2   2   234 0   0   0   Online
32  3   3   2   0   0   0   0   Online

Virtual Drive Information:
VD DRV RLP RLS RLQ STS  SIZE      STATE   NAME
0  2   1   3   0   64kB 1143552MB Optimal SVN

BBU Information:
TYPE TEMP OK  RSOC ASOC RC      CC ME
BBU  20C  Yes 92%  77%  1377mAh 7  2%

I'm not sure how I should interpret these errors and what action, if any, I
should take. Do I need to replace - or at least rebuild - the offending
drive? Can I do that safely without taking the machine down?

The box is covered by Dell support but I'd like to get all my facts straight
before I call them and they try to pin a hardware problem on this "FreeBSD"
thing they've never heard of before...

Many thanks,

Scott
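[Editorial sketch] The physical drive table in the status report above can be checked mechanically. The thread does not give a legend for the column names, so this example assumes MEC, OEC and PFC are the media error, other error and predictive failure counters; that reading is an assumption, as is the helper name drives_with_errors.

```python
# Sketch: flag drives whose error counters are non-zero in the status
# table quoted above. The meaning of MEC/OEC/PFC is assumed, not confirmed.

STATUS = """\
ENC SLO DEV SEQ MEC OEC PFC LPF STATE
32 0 0 2 0 0 0 0 Online
32 1 1 2 0 0 0 0 Online
32 2 2 2 234 0 0 0 Online
32 3 3 2 0 0 0 0 Online
"""

COUNTERS = ("MEC", "OEC", "PFC")  # assumed to be error counters


def drives_with_errors(table: str) -> list[dict]:
    """Return the rows in which any assumed error counter is above zero."""
    lines = [line.split() for line in table.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    drives = [dict(zip(header, row)) for row in rows]
    return [d for d in drives if any(int(d[c]) > 0 for c in COUNTERS)]


for drive in drives_with_errors(STATUS):
    print(f"enclosure {drive['ENC']} slot {drive['SLO']}: "
          f"MEC={drive['MEC']} state={drive['STATE']}")
```

Only the drive in slot 2 is flagged, which matches the "PD 02(e0x20/s2)" named in the event log, even though its state still reads Online and the volume Optimal.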
On 08-08-2010 12:07, Scott Mitchell wrote:
> Hi all,
>
> I have a Dell R710 server running 8.0R/amd64, with a PERC 6 RAID controller
> and four SAS drives in a RAID10 configuration. The RAID controller does a
> weekly "patrol read" that threw up a load of errors in the most recent run:
>
>

Hello,

I have two of the same servers running 8.1 (just upgraded from 8.0) amd64.
I run ZFS so I just run the drives as JBOD (well, the closest thing to JBOD
this controller can get: a bunch of 1-disk RAID0 volumes).

I just wanted to say that I haven't seen any errors like the ones you
describe. Maybe they only appear when using the PERC RAID functionality, or
perhaps the disk really is going bad.

My servers are close to going into production, so any testing from my side
will have to be done without destroying filesystems, but let me know if you
need anything :)

Regards & good luck,

Thomas Steen Rasmussen
Svein Skogen (Listmail account)
2010-Aug-08 15:25 UTC
Patrol read errors on Dell Perc 6...
On 08.08.2010 12:07, Scott Mitchell wrote:
> Hi all,
>
> I have a Dell R710 server running 8.0R/amd64, with a PERC 6 RAID controller
> and four SAS drives in a RAID10 configuration. The RAID controller does a
> weekly "patrol read" that threw up a load of errors in the most recent run:
>
*SNIP*

Patrol reads are done internally in the controller firmware. You're getting
warnings that a disk is failing. Do with that information what you feel is
necessary.

//Svein