My system has been working reliably with 6.3 for quite some time... when I rebooted into single-user mode to do the installworld with the 7.0-RELEASE kernel, the install died about halfway through with READ_DMA TIMEOUT errors. Since I had a mixed system at that point, I set hw.ata.ata_dma="0" in /boot/loader.conf and completed the install. After a good boot to verify everything was working, I flipped hw.ata.ata_dma back and rebooted.

The corrupted sync message scared the heck out of me:

Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiti
Synncgi n(gm adxi sk6s0, svencoodnedss )r efmoari nsiynsgte.m. .pr1o0c ess
`syncer' to stop...8 7 8 3 3 3 1 0 0 0 0 done

And after the reboot, the READ_DMA timeouts were back.

I installed sysutils/smartmontools (output attached in case it's useful).

The only "odd" thing I can think of about my system is an unusually high HZ value (2386). I'm building a kernel now with 1000 to check whether that makes a difference.

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-RELEASE #1: Tue Feb 26 22:49:13 PST 2008 root@server.hurd.local:/usr/obj/usr/src/sys/SERVER Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel Pentium III (996.85-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x686 Stepping = 6 Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE> real memory = 1207959552 (1152 MB) avail memory = 1172832256 (1118 MB) MPTable: <AMI CNB30LE > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Assuming intbase of 0 ioapic1: Assuming intbase of 16 ioapic0 <Version 1.1> irqs 0-15 on motherboard ioapic1 <Version 1.1> irqs 16-31 on motherboard kbd1 at kbdmux0 cpu0 on motherboard cpu1 on motherboard pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard pci0: <PCI bus> on pcib0 vgapci0: <VGA-compatible display> port 0xd800-0xd8ff mem 0xfd000000-0xfdffffff,0xfeaff000-0xfeafffff irq 22 at device 1.0 on pci0 drm0: <Rage XL> on vgapci0 info: [drm] Initialized mach64 1.0.0 20020904 fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd400-0xd43f mem 0xfeafe000-0xfeafefff,0xfe900000-0xfe9fffff irq 20 at device 4.0 on pci0 miibus0: <MII bus> on fxp0 inphy0: <i82555 10/100 media interface> PHY 1 on miibus0 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto fxp0: Ethernet address: 00:e0:81:21:49:7e fxp0: [ITHREAD] fxp1: <Intel 82559 Pro/100 Ethernet> port 0xd000-0xd03f mem 0xfeafd000-0xfeafdfff,0xfe700000-0xfe7fffff irq 21 at device 5.0 on pci0 miibus1: <MII bus> on fxp1 inphy1: <i82555 10/100 media interface> PHY 1 on miibus1 inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto fxp1: Ethernet address: 00:e0:81:21:49:7f fxp1: [ITHREAD] isab0: <PCI-ISA bridge> port 0x580-0x58f at device 15.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 15.1 on pci0 ata0: <ATA channel 0> on atapci0 ata0: [ITHREAD] ata1: <ATA 
channel 1> on atapci0 ata1: [ITHREAD] ohci0: <OHCI (generic) USB controller> mem 0xfeafc000-0xfeafcfff irq 10 at device 15.2 on pci0 ohci0: [GIANT-LOCKED] ohci0: [ITHREAD] usb0: OHCI version 1.0, legacy support usb0: <OHCI (generic) USB controller> on ohci0 usb0: USB revision 1.0 uhub0: <(0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0 uhub0: 2 ports with 2 removable, self powered pcib1: <MPTable Host-PCI bridge> pcibus 1 on motherboard pci1: <PCI bus> on pcib1 pcm0: <Creative CT5880-C> port 0xef00-0xef3f irq 27 at device 2.0 on pci1 pcm0: <SigmaTel STAC9721/23 AC97 Codec> pcm0: [ITHREAD] pcm0: <Playback: DAC1,DAC2 / Record: ADC> pmtimer0 on isa0 orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xc9fff pnpid ORM0000 on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model MouseMan+, device ID 0 fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0 ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: <Parallel port bus> on ppc0 ppbus0: [ITHREAD] lpt0: <Printer> on ppbus0 lpt0: Interrupt-driven port ppc0: [GIANT-LOCKED] ppc0: [ITHREAD] sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio0: [FILTER] sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled unknown: <PNP0c01> can't assign resources (memory) unknown: <PNP0303> can't assign resources (port) speaker0: <PC speaker> at port 0x61 pnpid PNP0800 on isa0 unknown: <PNP0c02> can't 
assign resources (port) unknown: <PNP0f13> can't assign resources (irq) unknown: <PNP0501> can't assign resources (port) unknown: <PNP0401> can't assign resources (port) unknown: <PNP0700> can't assign resources (port) uhid0: <Logitech Inc. WingMan RumblePad, class 0/0, rev 1.00/1.07, addr 2> on uhub0 uhub0: device problem (IOERROR), disabling port 2 Timecounters tick every 0.838 msec ipfw2 (+ipv6) initialized, divert enabled, rule-based forwarding disabled, default to deny, logging disabled ad0: 239372MB <Maxtor 7Y250P0 YAR41BW0> at ata0-master PIO4 acd0: CDRW <HL-DT-ST GCE-8520B/1.04> at ata1-master UDMA33 SMP: AP CPU #1 Launched! Trying to mount root from ufs:/dev/ad0s1a fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 WARNING: attempt to net_add_domain(netgraph) after domainfinalize() fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 vmnet0: Ethernet address: 00:bd:d3:f1:01:00 vmnet1: Ethernet address: 00:bd:dc:f1:01:01 vmnet2: Ethernet address: 00:bd:de:f1:01:02 vmnet3: Ethernet address: 00:bd:16:f2:01:03 vmnet4: Ethernet address: 00:bd:1e:b9:02:04 vmnet5: Ethernet address: 00:bd:20:b9:02:05 vmnet6: Ethernet address: 00:bd:22:b9:02:06 vmnet7: Ethernet address: 00:bd:23:b9:02:07 bridge0: Ethernet address: da:a1:67:37:d0:c4 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: promiscuous mode enabled vmnet0: promiscuous mode enabled vmnet1: promiscuous mode enabled vmnet2: promiscuous mode enabled vmnet3: promiscuous mode enabled Waiting (max 60 seconds) for system process `vnlru' to stop...done Waiti Synncgi n(gm adxi sk6s0, svencoodnedss )r efmoari nsiynsgte.m. 
.pr1o0c ess `syncer' to stop...8 7 8 3 3 3 1 0 0 0 0 done Waiting (max 60 seconds) for system process `bufdaemon' to stop...done All buffers synced. Uptime: 29m46s Rebooting... cpu_reset: Stopping other CPUs Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.0-RELEASE #1: Tue Feb 26 22:49:13 PST 2008 root@server.hurd.local:/usr/obj/usr/src/sys/SERVER Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel Pentium III (996.85-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x686 Stepping = 6 Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE> real memory = 1207959552 (1152 MB) avail memory = 1172832256 (1118 MB) MPTable: <AMI CNB30LE > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Assuming intbase of 0 ioapic1: Assuming intbase of 16 ioapic0 <Version 1.1> irqs 0-15 on motherboard ioapic1 <Version 1.1> irqs 16-31 on motherboard kbd1 at kbdmux0 cpu0 on motherboard cpu1 on motherboard pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard pci0: <PCI bus> on pcib0 vgapci0: <VGA-compatible display> port 0xd800-0xd8ff mem 0xfd000000-0xfdffffff,0xfeaff000-0xfeafffff irq 22 at device 1.0 on pci0 drm0: <Rage XL> on vgapci0 info: [drm] Initialized mach64 1.0.0 20020904 fxp0: <Intel 82559 Pro/100 Ethernet> port 0xd400-0xd43f mem 0xfeafe000-0xfeafefff,0xfe900000-0xfe9fffff irq 20 at device 4.0 on pci0 miibus0: <MII bus> on fxp0 inphy0: <i82555 10/100 media interface> PHY 1 on miibus0 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto fxp0: Ethernet address: 00:e0:81:21:49:7e fxp0: [ITHREAD] fxp1: <Intel 82559 Pro/100 Ethernet> port 0xd000-0xd03f mem 0xfeafd000-0xfeafdfff,0xfe700000-0xfe7fffff irq 21 at device 5.0 on pci0 miibus1: <MII bus> on 
fxp1 inphy1: <i82555 10/100 media interface> PHY 1 on miibus1 inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto fxp1: Ethernet address: 00:e0:81:21:49:7f fxp1: [ITHREAD] isab0: <PCI-ISA bridge> port 0x580-0x58f at device 15.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <ServerWorks ROSB4 UDMA33 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 15.1 on pci0 ata0: <ATA channel 0> on atapci0 ata0: [ITHREAD] ata1: <ATA channel 1> on atapci0 ata1: [ITHREAD] ohci0: <OHCI (generic) USB controller> mem 0xfeafc000-0xfeafcfff irq 10 at device 15.2 on pci0 ohci0: [GIANT-LOCKED] ohci0: [ITHREAD] usb0: OHCI version 1.0, legacy support usb0: <OHCI (generic) USB controller> on ohci0 usb0: USB revision 1.0 uhub0: <(0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0 uhub0: 2 ports with 2 removable, self powered pcib1: <MPTable Host-PCI bridge> pcibus 1 on motherboard pci1: <PCI bus> on pcib1 pcm0: <Creative CT5880-C> port 0xef00-0xef3f irq 27 at device 2.0 on pci1 pcm0: <SigmaTel STAC9721/23 AC97 Codec> pcm0: [ITHREAD] pcm0: <Playback: DAC1,DAC2 / Record: ADC> pmtimer0 on isa0 orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xc9fff pnpid ORM0000 on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: <PS/2 Mouse> irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model MouseMan+, device ID 0 fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0 ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: <Parallel port bus> on ppc0 
ppbus0: [ITHREAD] lpt0: <Printer> on ppbus0 lpt0: Interrupt-driven port ppc0: [GIANT-LOCKED] ppc0: [ITHREAD] sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio0: [FILTER] sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled unknown: <PNP0c01> can't assign resources (memory) unknown: <PNP0303> can't assign resources (port) speaker0: <PC speaker> at port 0x61 pnpid PNP0800 on isa0 unknown: <PNP0c02> can't assign resources (port) unknown: <PNP0f13> can't assign resources (irq) unknown: <PNP0501> can't assign resources (port) unknown: <PNP0401> can't assign resources (port) unknown: <PNP0700> can't assign resources (port) uhid0: <Logitech Inc. WingMan RumblePad, class 0/0, rev 1.00/1.07, addr 2> on uhub0 uhub0: device problem (IOERROR), disabling port 2 Timecounters tick every 0.838 msec ipfw2 (+ipv6) initialized, divert enabled, rule-based forwarding disabled, default to deny, logging disabled ad0: 239372MB <Maxtor 7Y250P0 YAR41BW0> at ata0-master UDMA33 acd0: CDRW <HL-DT-ST GCE-8520B/1.04> at ata1-master UDMA33 SMP: AP CPU #1 Launched! 
Trying to mount root from ufs:/dev/ad0s1a fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 WARNING: attempt to net_add_domain(netgraph) after domainfinalize() fxp1: Microcode loaded, int_delay: 1000 usec bundle_max: 6 ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=56943008 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=56943008 ad0: FAILURE - READ_DMA timed out LBA=56943008 g_vfs_done():ad0s1f[READ(offset=24591417344, length=65536)]error = 5 vnode_pager_getpages: I/O read error vmnet0: Ethernet address: 00:bd:2e:71:02:00 vmnet1: Ethernet address: 00:bd:3a:71:02:01 vmnet2: Ethernet address: 00:bd:3b:71:02:02 vmnet3: Ethernet address: 00:bd:66:71:02:03 vmnet4: Ethernet address: 00:bd:a6:f0:02:04 vmnet5: Ethernet address: 00:bd:a8:f0:02:05 vmnet6: Ethernet address: 00:bd:a9:f0:02:06 vmnet7: Ethernet address: 00:bd:ab:f0:02:07 bridge0: Ethernet address: e6:d3:5f:ee:dd:23 fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6 fxp0: promiscuous mode enabled vmnet0: promiscuous mode enabled vmnet1: promiscuous mode enabled vmnet2: promiscuous mode enabled vmnet3: promiscuous mode enabled ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=6815744 ad0: FAILURE - READ_DMA timed out LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=6815744 ad0: FAILURE - READ_DMA timed out LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=6815744 ad0: FAILURE - READ_DMA timed out LBA=6815744 ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=59969424 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=59969424 ad0: 
FAILURE - READ_DMA timed out LBA=59969424 g_vfs_done():ad0s1f[READ(offset=26140942336, length=65536)]error = 5 vnode_pager_getpages: I/O read error vm_fault: pager read error, pid 1930 (kdm-bin_greet) ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=155868384 ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=155868512 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=155868384 ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=155868512 ad0: FAILURE - READ_DMA timed out LBA=155868384 ad0: FAILURE - READ_DMA timed out LBA=155868512 g_vfs_done():ad0s1g[READ(offset=21554118656, length=131072)]error = 5 pid 1930 (kdm-bin_greet), uid 0: exited on signal 11 (core dumped) -------------- next part -------------- smartctl version 5.37 [i386-portbld-freebsd7.0] Copyright (C) 2002-6 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION ==Model Family: Maxtor MaXLine Plus II Device Model: Maxtor 7Y250P0 Serial Number: Y61WGHYE Firmware Version: YAR41BW0 User Capacity: 251,000,193,024 bytes Device is: In smartctl database [for details use: -P show] ATA Version is: 7 ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0 Local Time is: Wed Feb 27 00:45:51 2008 PST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION ==SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 363) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. 
No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. No General Purpose Logging support. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 107) minutes. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 3 Spin_Up_Time 0x0027 188 179 063 Pre-fail Always - 26142 4 Start_Stop_Count 0x0032 253 253 000 Old_age Always - 147 5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 4 6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail Offline - 0 7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0 8 Seek_Time_Performance 0x0027 252 245 187 Pre-fail Always - 58440 9 Power_On_Minutes 0x0032 219 219 000 Old_age Always - 1052h+05m 10 Spin_Retry_Count 0x002b 253 252 157 Pre-fail Always - 0 11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 253 253 000 Old_age Always - 251 192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0 194 Temperature_Celsius 0x0032 253 253 000 Old_age Always - 48 195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 11498 196 Reallocated_Event_Count 0x0008 253 253 000 Old_age Offline - 0 197 Current_Pending_Sector 0x0008 253 253 000 Old_age Offline - 0 198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age Offline - 0 200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0 201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 14 202 TA_Increase_Count 0x000a 253 252 000 Old_age Always - 0 203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail Always - 11 204 Shock_Count_Write_Opern 
0x000a 253 252 000 Old_age Always - 0 205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age Always - 0 207 Spin_High_Current 0x002a 253 252 000 Old_age Always - 0 208 Spin_Buzz 0x002a 253 252 000 Old_age Always - 0 209 Offline_Seek_Performnce 0x0024 198 194 000 Old_age Offline - 0 99 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0 100 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0 101 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0 SMART Error Log Version: 1 ATA Error Count: 2 CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN = Sector Number Register [HEX] CL = Cylinder Low Register [HEX] CH = Cylinder High Register [HEX] DH = Device/Head Register [HEX] DC = Device Command Register [HEX] ER = Error register [HEX] ST = Status register [HEX] Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days. Error 2 occurred at disk power-on lifetime: 5171 hours (215 days + 11 hours) When the command that caused the error occurred, the device was in an unknown state. After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 10 80 00 c8 e6 Error: UNC 16 sectors at LBA = 0x06c80080 = 113770624 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- c8 00 10 80 00 c8 e6 08 00:13:29.312 READ DMA c8 00 04 98 14 02 e0 08 00:13:18.144 READ DMA c8 00 04 3c 69 06 e0 08 00:13:18.144 READ DMA c8 00 10 50 44 06 e0 08 00:13:09.008 READ DMA c8 00 0c 80 14 04 e0 08 00:13:09.008 READ DMA Error 1 occurred at disk power-on lifetime: 5171 hours (215 days + 11 hours) When the command that caused the error occurred, the device was in an unknown state. 
After command completion occurred, registers were: ER ST SC SN CL CH DH -- -- -- -- -- -- -- 40 51 10 80 00 c8 e6 Error: UNC 16 sectors at LBA = 0x06c80080 = 113770624 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- ---------------- -------------------- c8 00 10 80 00 c8 e6 08 00:13:07.840 READ DMA c8 00 01 00 00 00 e0 08 00:13:07.840 READ DMA c8 00 02 00 00 00 e0 08 00:13:07.840 READ DMA c8 00 01 01 00 00 e0 08 00:13:07.840 READ DMA c8 00 04 00 01 06 e0 08 00:13:07.840 READ DMA SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 394 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay.
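For readers who want to line up the kernel messages in the dmesg above, here is a minimal Python sketch. It pairs each `READ_DMA` FAILURE line with the `g_vfs_done()` line that follows it and works out where the partition would have to start on the disk for the two numbers to agree. The sample lines are copied from the log above; the 512-byte sector size and the function name `pair_failures` are assumptions made for illustration, not something stated in the post.

```python
import re

# Sample lines copied from the dmesg output in this message.
LOG = """\
ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=56943008
ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=56943008
ad0: FAILURE - READ_DMA timed out LBA=56943008
g_vfs_done():ad0s1f[READ(offset=24591417344, length=65536)]error = 5
ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=59969424
ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=59969424
ad0: FAILURE - READ_DMA timed out LBA=59969424
g_vfs_done():ad0s1f[READ(offset=26140942336, length=65536)]error = 5
"""

FAILURE_RE = re.compile(r"^(\w+): FAILURE - READ_DMA timed out LBA=(\d+)$")
VFS_RE = re.compile(
    r"^g_vfs_done\(\):(\w+)\[READ\(offset=(\d+), length=(\d+)\)\]error = (\d+)$"
)

SECTOR_SIZE = 512  # assumption: 512-byte sectors on this disk


def pair_failures(log_text):
    """Pair each READ_DMA FAILURE with the g_vfs_done line that follows it.

    Returns tuples of (partition, disk LBA, byte offset in partition,
    implied first sector of the partition).
    """
    pairs = []
    last_lba = None
    for raw in log_text.splitlines():
        line = raw.strip()
        failure = FAILURE_RE.match(line)
        if failure:
            last_lba = int(failure.group(2))
            continue
        vfs = VFS_RE.match(line)
        if vfs and last_lba is not None:
            offset = int(vfs.group(2))
            implied_start = last_lba - offset // SECTOR_SIZE
            pairs.append((vfs.group(1), last_lba, offset, implied_start))
            last_lba = None
    return pairs


pairs = pair_failures(LOG)
for partition, lba, offset, start in pairs:
    print(f"{partition}: LBA {lba} = partition start {start} + {offset // SECTOR_SIZE}")
```

Both failures on ad0s1f imply the same partition start (sector 8912896), which suggests the disk LBA and the filesystem offset in each pair describe the same read.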
Jeremy Chadwick
2008-Feb-27 12:11 UTC
ad0 READ_DMA TIMEOUT errors on install of 7.0-RELEASE
On Wed, Feb 27, 2008 at 01:11:36AM -0800, Stephen Hurd wrote:
> ... The corrupted sync message scared the heck out of me:
> Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Waiti
> Synncgi n(gm adxi sk6s0, svencoodnedss )r efmoari nsiynsgte.m. .pr1o0c ess
> `syncer' to stop...8 7 8 3 3 3 1 0 0 0 0 done

http://lists.freebsd.org/pipermail/freebsd-current/2007-October/078145.html
http://lists.freebsd.org/pipermail/freebsd-current/2007-November/079130.html
http://lists.freebsd.org/pipermail/freebsd-current/2007-November/079131.html
http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038727.html

> And after the reboot, the READ_DMA timeouts were back.

You're not the only one seeing this behaviour. There are too many posts in the past reporting similar. Here's the breakdown:

* Some reporting this problem have been told to replace their ATA or SATA cables (which have previously been known to be working, but cables going bad does happen) -- and this has fixed the problem for a couple.
* Some have checked their SMART stats and found their disks to be in perfect condition.
* Some have switched to alternate operating systems (usually Linux) for a short while and seen no sign of DMA timeouts.
* Some have replaced the storage controller to no avail, and some have replaced the entire motherboard to no avail. In some cases (myself included), replacing the motherboard did in fact help.

However: in your case, your disk does look to have problems based on the SMART output you provided. It does not matter how new/old the disk is, by the way. I'll point out the problematic stats. You need to replace the disk ASAP.
BTW, any SMART stats you see labelled "Offline" will not be updated until you perform an offline test (smartctl -t short or smartctl -t long).

> The only "odd" think I can think of about my system is an unusually high HZ
> value (2386) I'm building a kernel now with 1000 to check if that makes a
> difference.

This is not the cause, rest assured.

> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
> 5 Reallocated_Sector_Ct 0x0033 253 253 063 Pre-fail Always - 4

This shows you've had 4 reallocated sectors, meaning your disk does in fact have bad blocks. In 90% of the cases out there, bad blocks continue to "grow" over time, due to whatever reason (I remember reading an article explaining it, but I can't for the life of me find the URL).

> 194 Temperature_Celsius 0x0032 253 253 000 Old_age Always - 48

This is excessive, and may be contributing to the problems. A hard disk running at 48C is not a good sign. This should really be somewhere between high 20s and mid 30s.

> 195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 11498

This implies a large number of ECC (error correction) activities have occurred, but all were successful.

> Error 2 occurred at disk power-on lifetime: 5171 hours (215 days + 11 hours)
> When the command that caused the error occurred, the device was in an unknown state.
> Error 1 occurred at disk power-on lifetime: 5171 hours (215 days + 11 hours)
> When the command that caused the error occurred, the device was in an unknown state.

These are automated SMART log entries confirming the DMA failures. The fact that SMART saw them means that the disk is also aware of said issues. These may have been caused by the reallocated sectors. It's also interesting that the LBAs are different from the ones FreeBSD reported issues with.

My advice to you is: replace the disk ASAP. This problem will only get worse.
Try another hard disk brand too (I don't have anything "against" Maxtor, but usually it's recommended to avoid a brand you have problems with until the next time you have issues, then switch brands, etc. etc...). I'm very fond of Western Digital's SE16, RE, and RE2 series currently. But avoid Fujitsu and Samsung (both have a long track record of having buggy drive firmwares, forcing vendors to make custom workarounds for issues); stick with Seagate, Western Digital, or Maxtor.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
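The remark above that the SMART-logged LBA differs from the LBAs FreeBSD reported can be checked with a little arithmetic. The sketch below rebuilds the 28-bit LBA from the SN/CL/CH/DH register values in the SMART error log (80 00 c8 e6) and compares it against the LBAs in the kernel's FAILURE lines. The helper name `lba28_from_registers` is made up for this illustration; the register values and LBAs are copied from the original post.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Bits 0-7 come from Sector Number, 8-15 from Cylinder Low,
    16-23 from Cylinder High, and 24-27 from the low nibble of
    Device/Head.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# SN CL CH DH taken from the SMART error log row "40 51 10 80 00 c8 e6".
smart_lba = lba28_from_registers(0x80, 0x00, 0xC8, 0xE6)

# LBAs from the "FAILURE - READ_DMA timed out" lines in the dmesg.
kernel_lbas = {56943008, 6815744, 59969424, 155868384, 155868512}

print(f"SMART error LBA: {smart_lba:#x} = {smart_lba}")
print("Also reported by the kernel:", smart_lba in kernel_lbas)
```

The result matches the `0x06c80080 = 113770624` that smartctl prints, and that value does not appear among the kernel-reported LBAs, consistent with the observation in the reply.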
I'm having very similar problems on my system that I just upgraded last night. It's a MiniITX board, model EN1200. My system can't remain up for more than 10 minutes before something locks it up, and frequently the screen will output error messages relating to DMA. I'd love to be able to help and diagnose this problem, but I'm unsure how to go about it, being relatively inexperienced. The system was working perfectly fine on 6.3 up until I upgraded last night.
> last night. It's a MiniITX board, model EN1200. My system can't remain
> up for more than 10minutes before something locks it up and frequently
> the screen will output error messages relating to DMA.

As a workaround, adding the line:

hw.ata.ata_dma="0"

to /boot/loader.conf will disable DMA and prevent the hangs that are caused by the DMA timeouts.
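To confirm the workaround line is actually present before rebooting, something like the following Python sketch could be used. It parses loader.conf-style `name="value"` text and reports the `hw.ata.ata_dma` setting. The parser is deliberately simplistic (it assumes one setting per line and `#` comments only), and `parse_loader_conf` is a hypothetical helper, not part of any FreeBSD tool. In the dmesg from the first message, the boot with DMA disabled shows ad0 attached at PIO4, and the boot after re-enabling it shows UDMA33, which is another way to verify the tunable took effect.

```python
def parse_loader_conf(text):
    """Parse simple name="value" lines, ignoring blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


# Sample text standing in for /boot/loader.conf; on a real system one
# would read the file instead of using this string.
sample = '''\
# Work around READ_DMA timeouts by disabling ATA DMA
hw.ata.ata_dma="0"
'''

settings = parse_loader_conf(sample)
dma_disabled = settings.get("hw.ata.ata_dma") == "0"
print("ATA DMA disabled at boot:", dma_disabled)
```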
> > As a workaround, adding the line:
> > hw.ata.ata_dma="0"
> >
> > To /boot/loader.conf will disable DMA and prevent the hangs that are caused by the DMA timeouts.
> > Does that workaround work when the disks are sata?

Don't know. I personally would assume so, but I wouldn't be surprised if my assumption was proven wrong either.
I'm also experiencing this issue after upgrading to 7.0-RELEASE from 6.3. I've got 2 Western Digital 5000YS hard drives in a GEOM RAID 1 configuration, connected to a Promise on-board SATA controller. This worked flawlessly under 6.3, but since upgrading to 7.0 I'm getting the DMA timeouts under intense write activity. I get the errors below, and then after I have to hard-reset, the mirror rebuilds from scratch :(

I've pasted dmesg below if that is of any help. I'm really desperate for a solution here as I don't fancy reverting to 6.3. Let me know if there is any other info I can provide that may help identify the problem.

-Gianni

Mar 21 17:50:07 kananga kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:11 kananga kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:15 kananga kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Mar 21 17:50:19 kananga kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Mar 21 17:50:23 kananga kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Mar 21 17:50:23 kananga kernel: ad4: TIMEOUT - READ_DMA retrying (1 retry left) LBA=193407827
Mar 21 17:50:27 kananga kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:31 kananga kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:35 kananga kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Mar 21 17:50:39 kananga kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Mar 21 17:50:43 kananga kernel: ad6: WARNING - SET_MULTI taskqueue timeout - completing request directly
Mar 21 17:50:43 kananga kernel: ad6: TIMEOUT - WRITE_DMA retrying (1
retry left) LBA=193297119
Mar 21 17:50:47 kananga kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:51 kananga kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:50:55 kananga kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Mar 21 17:50:59 kananga kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Mar 21 17:51:03 kananga kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
Mar 21 17:51:03 kananga kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=193297119
Mar 21 17:51:07 kananga kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:51:11 kananga kernel: ad6: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Mar 21 17:51:15 kananga kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Mar 21 17:51:19 kananga kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-STABLE #2: Sun Mar 16 22:48:09 CET 2008
root@kananga.smersh.casa:/usr/obj/usr/src/sys/KANANGA
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2002.58-MHz K8-class CPU)
Origin = "AuthenticAMD" Id = 0x20fb1 Stepping = 1
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x1<SSE3>
AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow!+,3DNow!>
AMD Features2=0x3<LAHF,CMP>
Cores per package: 2
usable memory = 1063124992 (1013 MB)
avail memory = 1024499712 (977 MB)
ACPI APIC Table: <A M I OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 0.3> irqs 0-23 on motherboard
kbd1 at kbdmux0
cryptosoft0: <software crypto> on motherboard
acpi0: <A M I OEMXSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, 3fef0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
powernow0: <Cool`n'Quiet K8> on cpu0
cpu1: <ACPI CPU> on acpi0
powernow1: <Cool`n'Quiet K8> on cpu1
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA K8T800Pro host to PCI bridge> on hostb0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xe000-0xe0ff mem 0xe8000000-0xefffffff,0xfbe00000-0xfbe0ffff irq 16 at device 0.0 on pci1
vgapci1: <VGA-compatible display> mem 0xf0000000-0xf7ffffff,0xfbf00000-0xfbf0ffff at device 0.1 on pci1
fwohci0: <VIA Fire II (VT6306)> port 0x8400-0x847f mem 0xfb300000-0xfb3007ff irq 16 at device 7.0 on pci0
fwohci0: [FILTER]
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:11:d8:00:00:1b:2a:01
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
dcons_crom0: <dcons configuration ROM> on firewire0
dcons_crom0: bus_addr 0x24e8000
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
atapci0: <Promise PDC20378 SATA150 controller> port 0x9400-0x943f,0x9000-0x900f,0x8800-0x887f mem 0xfb500000-0xfb500fff,0xfb400000-0xfb41ffff irq 18 at device 8.0 on pci0
atapci0: [ITHREAD]
atapci0: [ITHREAD]
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: [ITHREAD]
em0: <Intel(R) PRO/1000 Network Connection Version - 6.7.3> port 0x9800-0x983f mem 0xfb800000-0xfb81ffff,0xfb700000-0xfb71ffff irq 17 at device 12.0 on pci0
em0: Ethernet address: 00:0e:0c:ab:ad:42
em0: [FILTER]
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xa000-0xa0ff mem 0xfba00000-0xfba00fff irq 19 at device 14.0 on pci0
ahc0: [ITHREAD]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
atapci1: <VIA 6420 SATA150 controller> port 0xc000-0xc007,0xb800-0xb803,0xb400-0xb407,0xb000-0xb003,0xa800-0xa80f,0xa400-0xa4ff irq 20 at device 15.0 on pci0
atapci1: [ITHREAD]
ata5: <ATA channel 0> on atapci1
ata5: [ITHREAD]
ata6: <ATA channel 1> on atapci1
ata6: [ITHREAD]
atapci2: <VIA 8237 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: <ATA channel 0> on atapci2
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci2
ata1: [ITHREAD]
uhci0: <VIA 83C572 USB controller> port 0xc400-0xc41f irq 21 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xc800-0xc81f irq 21 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 21 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 21 at device 16.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: <VIA 83C572 USB controller> on uhci3
usb3: USB revision 1.0
uhub3: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: <VIA VT6202 USB 2.0 controller> mem 0xfbc00000-0xfbc000ff irq 21 at device 16.4 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: waiting for BIOS to give up control
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <VIA VT6202 USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: <VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
uhub4: 8 ports with 8 removable, self powered
uhub5: <vendor 0x04cc product 0x1520, class 9/0, rev 2.00/2.00, addr 2> on uhub4
uhub5: single transaction translator
uhub5: 3 ports with 2 removable, self powered
ukbd0: <No brand KVM, class 0/0, rev 1.10/0.00, addr 3> on uhub5
kbd2 at ukbd0
ums0: <No brand KVM, class 0/0, rev 1.10/0.00, addr 3> on uhub5
ums0: 5 buttons and Z dir.
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
pci0: <multimedia, audio> at device 17.5 (no driver attached)
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 4 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 4 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 on acpi0
sio1: type 16550A
sio1: [FILTER]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xccfff,0xcd000-0xd0fff,0xd1000-0xd1fff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ugen0: <American Power Conversion Smart-UPS 1000 FW:600.3.I USB FW: 1.5, class 0/0, rev 1.10/0.06, addr 2> on uhub0
Timecounters tick every 1.000 msec
Fast IPsec: Initialized Security Association Processing.
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
acd0: DVDR <PHILIPS SPD2414T/P1.0> at ata0-master UDMA66
ad4: 476940MB <WDC WD5000YS-01MPB1 09.02E09> at ata2-master SATA150
ad6: 476940MB <WDC WD5000YS-01MPB1 09.02E09> at ata3-master SATA150
ad10: 114473MB <Seagate ST3120827AS 3.42> at ata5-master SATA150
ad12: 152627MB <Seagate ST3160827AS 3.42> at ata6-master SATA150
Waiting 5 seconds for SCSI devices to settle
GEOM_MIRROR: Device mirror/gm2 launched (2/2).
GEOM_MIRROR: Device mirror/gm1s1 launched (1/2).
GEOM_MIRROR: Device gm1s1: rebuilding provider ad4s1.
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00
sa0 at ahc0 bus 0 target 15 lun 0
sa0: <SEAGATE DAT DAT72-000 A060> Removable Sequential Access SCSI-3 device
sa0: 80.000MB/s transfers (40.000MHz, offset 32, 16bit)
cd0 at ata0 bus 0 target 0 lun 0
cd0: <PHILIPSS MSPP:D 2A4P1 4CTP UP 1#.01> LRaeumnocvhaebdl!e CD-ROM SCSI-0 device
cd0: 66.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
Trying to mount root from ufs:/dev/mirror/gm2s1a
WARNING: /home was not properly dismounted
WARNING: /spare was not properly dismounted
WARNING: /usr was not properly dismounted
WARNING: /var was not properly dismounted
WARNING: /data was not properly dismounted
kqemu version 0x00010300
kqemu: KQEMU installed, max_locked_mem=519104kB.

Just to add my 5 cents to this topic: I had similar READ_DMA_TIMEOUT problems with a Promise Ultra100 controller and 2 PATA disks. Under heavy use they caused various problems, including memory corruption. As soon as I replaced the controller with a cheap VIA 6421 controller, all the problems went away. One strange thing was that with only 1 PATA disk attached there were no (obvious) problems. They only started to appear when more disks were attached to the controller.
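
For anyone who wants to check whether the timeouts cluster on the disks behind one controller, a rough way is to tally the ata messages in the syslog per device. This is only a sketch: the regular expression is inferred from the log lines quoted earlier in this thread, and `tally` and `per_device` are made-up helper names, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread, e.g.
#   ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=193297119
#   ad6: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
# It is an assumption that other ata messages follow the same shape.
ATA_MSG = re.compile(
    r"\b(ad\d+): (TIMEOUT|WARNING) - (.+?)(?: retrying| taskqueue timeout|$)"
)

def tally(lines):
    """Count ata timeout/warning messages per (device, kind, command)."""
    counts = Counter()
    for line in lines:
        m = ATA_MSG.search(line)
        if m:
            counts[m.groups()] += 1
    return counts

def per_device(counts):
    """Collapse the detailed tally to one total per device name."""
    totals = Counter()
    for (device, _kind, _command), n in counts.items():
        totals[device] += n
    return totals
```

Passing an open log file (for example /var/log/messages, if that is where your kernel messages end up) to `tally` and then `per_device` gives one count per disk, which can be compared against the "adN ... at ataM-master" lines in dmesg to see which controller the noisy disks sit on.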