Hi All,

I'm having a few issues and I could use another brain on this. I have a machine which is throwing up ata errors like this:

Jul 22 00:16:16 <kern.crit> mail /kernel: ad2s1e: soft error (ECC corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4)ad2: timeout waiting for cmd=ef s=00 e=7f
Jul 22 00:16:16 <kern.crit> mail /kernel: trying PIO mode
Jul 22 00:16:18 <kern.crit> mail /kernel: ad2s1e: soft error (ECC corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4) status=7f error=7f
Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: timeout waiting for DRQ - resetting
Jul 22 00:16:18 <kern.crit> mail /kernel: ata1: resetting devices ..
Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: removed from configuration
Jul 22 00:16:18 <kern.crit> mail /kernel: done

It was doing this a few weeks ago, so I swapped the disk out and it was happy for a bit. Now it's coming back and doing it on the new disk (which is in the same place as the old one (sec-master)).

I figure this means the controller has gone bork?
relevant bits of dmesg:

atapci0: <Intel ICH ATA66 controller> port 0xf000-0xf00f at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
ad0: 19541MB <Maxtor 2B020H1> [39703/16/63] at ata0-master UDMA66
ad2: 76293MB <Maxtor 6Y080L0> [155009/16/63] at ata1-master UDMA66

FreeBSD dexter.alink.co.za 4.9-RELEASE-p10 FreeBSD 4.9-RELEASE-p10 #0: Fri Jun 11 17:16:17 BST 2004 george@dexter.alink.co.za:/usr/obj/usr/src/sys/DEXTER i386

atapci0@pci0:31:1: class=0x010180 card=0x24118086 chip=0x24118086 rev=0x02 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82801AA IDE Controller (UltraATA/66)'
    class    = mass storage
    subclass = ATA

ATA channel 0:
    Master: ad0 <Maxtor 2B020H1/WAH21PB0> ATA/ATAPI rev 6
    Slave:  no device present
ATA channel 1:
    Master: ad2 <Maxtor 6Y080L0/YAR41BW0> ATA/ATAPI rev 7
    Slave:  no device present

Any advice would be appreciated, but please cc me in a reply.

Many thanks :)

George

--
George Barnett
Reality Engineer & Explorer
e: george@alink.co.za
m: +44 778 884 7205

Things must be as they may - William Shakespear, Henry V
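The kernel messages quoted above pack several errors into each syslog entry, which makes it hard to see at a glance whether the failures keep hitting the same block. The following is only a rough sketch, not a FreeBSD tool: the regular expression is an assumption modelled on the two message shapes shown in this post ("soft error ... fsbn N" and "hard error ... fsbn N"), and the names `ERROR_RE`, `tally_errors` and `SAMPLE_LOG` are made up for illustration. It counts errors per device, kind and block number.

```python
import re
from collections import Counter

# Assumed pattern, based only on the message shapes quoted in this thread, e.g.
# "ad2s1e: hard error reading fsbn 77465920 of 38732960-38733087 ..."
ERROR_RE = re.compile(
    r"\b(?P<dev>ad\d+)\w*: (?P<kind>soft|hard) error.*? fsbn (?P<fsbn>\d+)"
)


def tally_errors(log_text):
    """Count occurrences of each (device, kind, fsbn) triple in log text."""
    counts = Counter()
    for m in ERROR_RE.finditer(log_text):
        key = (m.group("dev"), m.group("kind"), int(m.group("fsbn")))
        counts[key] += 1
    return counts


# The two error entries from the post, one syslog entry per line.
SAMPLE_LOG = "\n".join([
    "Jul 22 00:16:16 <kern.crit> mail /kernel: ad2s1e: soft error (ECC corrected) "
    "reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 "
    "sn 4)ad2s1e: hard error reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn "
    "77465920; cn 461106 tn 18 sn 4)ad2: timeout waiting for cmd=ef s=00 e=7f",
    "Jul 22 00:16:18 <kern.crit> mail /kernel: ad2s1e: soft error (ECC corrected) "
    "reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 "
    "sn 4)ad2s1e: hard error reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn "
    "77465920; cn 461106 tn 18 sn 4) status=7f error=7f",
])

if __name__ == "__main__":
    for (dev, kind, fsbn), n in sorted(tally_errors(SAMPLE_LOG).items()):
        print(f"{dev} {kind} error at fsbn {fsbn}: {n} time(s)")
```

Run over a longer stretch of /var/log/messages, a tally like this might help separate the two cases: repeated hits on one block number would lean toward a bad sector, while errors scattered across many blocks or drives would fit a cabling, power, heat or controller problem better. Note that it expects each syslog entry on a single line, so wrapped copies of the log would need to be rejoined first.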
On Thu, Jul 22, 2004 at 10:32:20AM +0100, George Barnett wrote:
> Hi All,
>
> I'm having a few issues and I could use another brain on this. I have a
> machine which is throwing up ata errors like this:
>
> Jul 22 00:16:16 <kern.crit> mail /kernel: ad2s1e: soft error (ECC
> corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn
> 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920
> of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4)ad2:
> timeout waiting for cmd=ef s=00 e=7f
> Jul 22 00:16:16 <kern.crit> mail /kernel: trying PIO mode
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2s1e: soft error (ECC
> corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn
> 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920
> of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4) status=7f
> error=7f
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: timeout waiting for DRQ -
> resetting
> Jul 22 00:16:18 <kern.crit> mail /kernel: ata1: resetting devices ..
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: removed from configuration
> Jul 22 00:16:18 <kern.crit> mail /kernel: done
>
> It was doing this a few weeks ago, so I swapped the disk out and it was
> happy for a bit. Now it's coming back and doing it on the new disk
> (which is in the same place as the old one (sec-master)).

Did you check the failed disk using Maxtor's utility, and did it report any errors? I have about the same situation -- 4.10-STABLE reporting a read error every few days, each time on a different disk, but every time it turned out that the disk really was good according to Maxtor's tests :(

> I figure this means the controller has gone bork?

That's also a possibility, but I've seen it happen on 3 different controllers already, which imho is too much coincidence for it to really be a controller failure. I'm still in the dark as to what it really is though.
It could very well still be a hardware error in my case (heat and memory not yet ruled out).

--Stijn

--
"Linux has many different distributions, meaning that you can probably find one that is exactly what you want (I even found one that looked like a Unix system)."
        -- Mike Meyer, from a posting at questions@freebsd.org
> I figure this means the controller has gone bork?

Maybe yes, but I've had nearly the same situation with Maxtor IDE HDDs. We bought about 10 Maxtor 6E040L0 HDDs in Dec 2003, and by May 2004 4(!) of them had failed with the same result as in your case. Maybe it was a bad consignment, because the same HDDs bought two months earlier or later work fine...

Check the temperature in the case - I assume they don't like even slight overheating.

Alexander Vasenin aka BlackSir
On Thursday, 22 July 2004 at 10:32:20 +0100, George Barnett wrote:
> Hi All,
>
> I'm having a few issues and I could use another brain on this. I have a
> machine which is throwing up ata errors like this:
>
> Jul 22 00:16:16 <kern.crit> mail /kernel: ad2s1e: soft error (ECC
> corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn
> 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920
> of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4)ad2:
> timeout waiting for cmd=ef s=00 e=7f
> Jul 22 00:16:16 <kern.crit> mail /kernel: trying PIO mode
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2s1e: soft error (ECC
> corrected) reading fsbn 77465920 of 38732960-38733087 (ad2s1 bn
> 77465920; cn 461106 tn 18 sn 4)ad2s1e: hard error reading fsbn 77465920
> of 38732960-38733087 (ad2s1 bn 77465920; cn 461106 tn 18 sn 4) status=7f
> error=7f
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: timeout waiting for DRQ -
> resetting
> Jul 22 00:16:18 <kern.crit> mail /kernel: ata1: resetting devices ..
> Jul 22 00:16:18 <kern.crit> mail /kernel: ad2: removed from configuration
> Jul 22 00:16:18 <kern.crit> mail /kernel: done

Hi, George.

Maybe it's just a shot in the dark, but when I saw this message a couple of days ago, I simply replaced the power cable coming from the power supply. I think the whole problem was in the connector of that cable. Anyway, the troubles disappeared after that.

Best regards,
Nikolay Pavlov.