Hi,

I have a SATA PCIe 6Gbps 4 port controller card made by Startech. The
kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as

Marvell Technology Group Ltd. 88SE9123

I use it to provide extra SATA ports to a raid system.
The HD's are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller.
However I am seeing lots of instances of errors like this

-----------------------------------------

Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr 0x400000 action 0x6 frozen
Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface fatal error
Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
Jun 22 03:13:23 viz1 kernel: ata13.00: cmd 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
Jun 22 03:13:24 viz1 kernel: ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 330)
Jun 22 03:13:24 viz1 kernel: ata13.00: configured for UDMA/133
Jun 22 03:13:24 viz1 kernel: ata13: EH complete

---------------------------------------

Vendor ID : 1b4b
Device ID : 9123

I tried to see what drivers were currently being used but the command
below gave nothing

grep -i 1b4b /lib/modules/*/modules.alias | grep -i 9123

I have changed the card and cables but still get the same errors. I am
wondering if the el6 kernel is using the correct drivers. I checked "elrepo"
against the "Vendor:Device ID pairing" and it also came up blank.

Any ideas would be much appreciated.

Regards,

Steve
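A note on the empty grep result: it does not necessarily mean no driver is bound. Some modules claim devices by PCI class rather than by vendor/device ID, so the ID never appears in modules.alias. The sketch below imitates that matching in Python. Everything in it is illustrative: the two sample alias lines, the subsystem IDs, and the class code 0x010601 (SATA in AHCI mode) are assumptions, not values read from this machine. On the real system, `lspci -k` reports the driver actually in use.

```python
from fnmatch import fnmatchcase

# Sample modules.alias content (assumed for illustration only). The first
# line is a class-based alias of the kind ahci uses; the second is an
# ordinary vendor/device alias for an unrelated network card.
SAMPLE_ALIASES = """\
alias pci:v*d*sv*sd*bc01sc06i01* ahci
alias pci:v00008086d0000100Esv*sd*bc*sc*i* e1000
"""


def pci_modalias(vendor, device, subvendor, subdevice, pci_class):
    """Build a PCI modalias string in the format found in sysfs."""
    base_class = (pci_class >> 16) & 0xFF
    sub_class = (pci_class >> 8) & 0xFF
    prog_if = pci_class & 0xFF
    return "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X" % (
        vendor, device, subvendor, subdevice, base_class, sub_class, prog_if)


def modules_for(modalias, alias_text):
    """Return the modules whose alias glob pattern matches the modalias."""
    matches = []
    for line in alias_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "alias":
            if fnmatchcase(modalias, parts[1]):
                matches.append(parts[2])
    return matches


# Hypothetical device: vendor 1b4b, device 9123, assumed to be in AHCI mode.
alias = pci_modalias(0x1B4B, 0x9123, 0x1B4B, 0x9123, 0x010601)
print(alias)
print(modules_for(alias, SAMPLE_ALIASES))
```

In this sketch the device matches the class-based line even though neither 1b4b nor 9123 appears anywhere in the alias text, which is exactly the situation where the grep above prints nothing.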
On Fri, 22 Jun 2012, Reindl Harald wrote:

> On 22.06.2012 13:58, Steve Brooks wrote:
>> I have a SATA PCIe 6Gbps 4 port controller card made by Startech. The
>> kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as
>>
>> Marvell Technology Group Ltd. 88SE9123
>>
>> I use it to provide extra SATA ports to a raid system.
>> The HD's are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller.
>> However I am seeing lots of instances of errors like this
>>
>> -----------------------------------------
>>
>> Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr
>> 0x400000 action 0x6 frozen
>> Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface
>> fatal error
>> Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
>> Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
>> Jun 22 03:13:23 viz1 kernel: ata13.00: cmd
>> 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
>> Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
>> Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
>> Jun 22 03:13:24 viz1 kernel: ata13: SATA link up 3.0 Gbps (SStatus 123
>> SControl 330)
>> Jun 22 03:13:24 viz1 kernel: ata13.00: configured for UDMA/133
>> Jun 22 03:13:24 viz1 kernel: ata13: EH complete
>>
>> ---------------------------------------
>>
>> Vendor ID : 1b4b
>> Device ID : 9123
>>
>> I tried to see what drivers were currently being used but the command
>> below gave nothing
>
> why do you care for drivers?
>
> this looks like dying hard-drives are always looking in syslog

Hi Reindl,

I should have mentioned that I swapped out the hard-drive and got the same
errors on the new drive. I checked the SMART attributes of the drive and
found nothing untoward, and also ran the long self-test (smartctl -t long ....),
which came back error free.

Steve

--
Dr Stephen Brooks
http://www-solar.mcs.st-and.ac.uk/
Solar MHD Theory Group
Tel :: 01334 463735
Fax :: 01334 463748
E-mail :: steveb at mcs.st-andrews.ac.uk
---------------------------------------
Mathematical Institute
North Haugh
University of St. Andrews
St Andrews, Fife KY16 9SS
SCOTLAND
---------------------------------------

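For anyone reading the log excerpt from the first message: the SErr value is a bitmask, and the kernel's "SError: { Handshk }" line is simply its decoded form. Below is a minimal sketch of that decoding. The bit positions follow the SATA SError register layout; the short names are written from memory of what libata prints and should be treated as an assumption rather than copied kernel source. The helper names are hypothetical.

```python
import re

# SError bit positions mapped to short names (names assumed, see note above).
SERR_BITS = {
    0: "RecovData",    # recovered data integrity error
    1: "RecovComm",    # recovered communication error
    8: "UnrecovData",  # unrecovered data integrity error
    9: "Persist",      # persistent communication error
    10: "Proto",       # protocol error
    11: "HostInt",     # internal host error
    16: "PHYRdyChg",   # PhyRdy signal changed
    17: "PHYInt",      # PHY internal error
    18: "CommWake",    # COMWAKE detected
    19: "10B8B",       # 10b-to-8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognised FIS type
    26: "DevExch",     # device exchanged
}


def decode_serr(value):
    """Return the names of all bits set in an SError value."""
    return [name for bit, name in sorted(SERR_BITS.items())
            if value & (1 << bit)]


def serr_from_log(line):
    """Extract the SErr value from a libata exception line, or None."""
    match = re.search(r"SErr (0x[0-9a-fA-F]+)", line)
    return int(match.group(1), 16) if match else None


line = ("Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 "
        "SAct 0x4 SErr 0x400000 action 0x6 frozen")
print(decode_serr(serr_from_log(line)))
```

Since 0x400000 is bit 22 alone, the only flag raised here is the handshake error, which is a link-level condition between controller and drive rather than a media error reported by the disk.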
Steve Brooks wrote:
>
> I have a SATA PCIe 6Gbps 4 port controller card made by Startech. The
> kernel (Linux viz1 2.6.32-220.4.1.el6.x86_64) sees it as
>
> Marvell Technology Group Ltd. 88SE9123
>
> I use it to provide extra SATA ports to a raid system.
> The HD's are all "WD2003FYYS" and so run at 3Gbps on the 6Gbps controller.
> However I am seeing lots of instances of errors like this
>
> -----------------------------------------
>
> Jun 22 03:13:23 viz1 kernel: ata13.00: exception Emask 0x10 SAct 0x4 SErr
> 0x400000 action 0x6 frozen
> Jun 22 03:13:23 viz1 kernel: ata13.00: irq_stat 0x08000000, interface
> fatal error
> Jun 22 03:13:23 viz1 kernel: ata13: SError: { Handshk }
> Jun 22 03:13:23 viz1 kernel: ata13.00: failed command: WRITE FPDMA QUEUED
> Jun 22 03:13:23 viz1 kernel: ata13.00: cmd
> 61/e8:10:98:05:1b/01:00:66:00:00/40 tag 2 ncq 249856 out
> Jun 22 03:13:23 viz1 kernel: ata13.00: status: { DRDY }
> Jun 22 03:13:23 viz1 kernel: ata13: hard resetting link
<snip>

Crap. First question: what make & model are the drives on it? If they're
Caviar Green, you're hosed. WD, and *maybe* Seagate as well, disabled a
certain function you used to be able to set on the lower cost,
consumer-grade models (in '09, I believe). When a server controller is
doing i/o and hits a problem, a server-grade drive gives up after
something like 6 sec and does error handling, I *think* to other sectors.
The consumer ones, on the other hand, keep trying for 1? 2? *minutes*; the
disabled function allowed a user to tell the drive to give up in a shorter
time. Meanwhile, a hardware controller will, as I said, have fits.

      mark "you'd think I just spent months dealing with this...."
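The function mark describes sounds like the drive's error-recovery time limit, which WD markets as TLER and which smartmontools exposes as SCT Error Recovery Control. That identification is an assumption, since the message does not name the feature. Where the drive and the installed smartctl both support it, the limit is queried with `smartctl -l scterc DEVICE` and set with `smartctl -l scterc,READ,WRITE DEVICE`, with the times given in tenths of a second. The sketch below only builds those command lines and runs nothing; the function names and the device path are hypothetical.

```python
def scterc_query_args(device):
    """Return the smartctl argv that reports the current SCT ERC limits."""
    return ["smartctl", "-l", "scterc", device]


def scterc_set_args(device, read_seconds, write_seconds):
    """Return the smartctl argv that would set SCT ERC limits on device.

    smartctl expects the limits in units of 100 ms, so 7 seconds is 70.
    """
    def to_deciseconds(seconds):
        value = int(round(seconds * 10))
        if value < 0:
            raise ValueError("time limit must not be negative")
        return value

    limits = "scterc,%d,%d" % (to_deciseconds(read_seconds),
                               to_deciseconds(write_seconds))
    return ["smartctl", "-l", limits, device]


# /dev/sdX is a placeholder, not a device from this thread.
print(" ".join(scterc_query_args("/dev/sdX")))
print(" ".join(scterc_set_args("/dev/sdX", 7, 7)))
```

Running the query form against each drive is a low-risk way to see whether the limit is present and enabled before concluding that long internal retries are involved.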