On Wed, Jun 22, 2011 at 05:52:39PM +0300, George Kontostanos wrote:
> This is the 3rd disk I have replaced in a 3-disk RAIDZ1 pool and I am really
> starting to believe that the problem is somewhere else. The disks reside in a
> Promise PDC40718 SATA300 controller. I have been running this setup since
> 8.0-Release with no issues until a few months ago, after 8.2-Release; it is
> now at 8.2-Stable.
> Symptoms:
>
> Jun 22 17:08:53 hp kernel: ata2: timeout waiting to issue command
> Jun 22 17:08:53 hp kernel: ata2: error issuing SETFEATURES ENABLE WCACHE command
> Jun 22 17:09:33 hp kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Jun 22 17:09:33 hp kernel: ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558741
> Jun 22 17:09:34 hp kernel: ata2: SIGNATURE: 00000101
> Jun 22 17:09:34 hp kernel: ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558869
> Jun 22 17:09:34 hp kernel: ata2: FAILURE - already active DMA on this device
> Jun 22 17:09:34 hp kernel: ata2: setting up DMA failed
> Jun 22 17:09:34 hp kernel: ata2: FAILURE - already active DMA on this device
> Jun 22 17:09:34 hp kernel: ata2: setting up DMA failed
>
>
> After a while the disk gets detached from the pool. Always the same disk.
> Right now I am in the process of resilvering:
>
>   pool: tank
>  state: ONLINE
> status: One or more devices is currently being resilvered. The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Wed Jun 22 17:09:40 2011
>         189G scanned out of 578G at 88.8M/s, 1h14m to go
>         62.9G resilvered, 32.63% done
> config:
>
>         NAME              STATE     READ WRITE CKSUM
>         tank              ONLINE       0     0     0
>           raidz1-0        ONLINE       0     0     0
>             label/zdisk1  ONLINE       0     0     0
>             label/zdisk2  ONLINE       0     0     0
>             label/zdisk3  ONLINE       0     0     0  (resilvering)
>
> But those errors have started to appear again. Again this is the 3rd disk
> replaced !!! Full dmesg attached
>
> --
> George Kontostanos
> aisecure.net <http://www.aisecure.net>
> Copyright (c) 1992-2011 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.2-STABLE #0: Mon Jun 6 19:00:19 EEST 2011
> gkontos@hp.aicom.loc:/usr/obj/usr/src/sys/ML110G3 amd64
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Pentium(R) D CPU 3.20GHz (3200.13-MHz K8-class CPU)
> Origin = "GenuineIntel" Id = 0xf64 Family = f Model = 6 Stepping = 4
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Features2=0xe4bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,CNXT-ID,CX16,xTPR,PDCM>
> AMD Features=0x20100800<SYSCALL,NX,LM>
> AMD Features2=0x1<LAHF>
> TSC: P-state invariant
> real memory = 4294967296 (4096 MB)
> avail memory = 4106780672 (3916 MB)
> ACPI APIC Table: <HP OEMAPIC >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0: Changing APIC ID to 2
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: <HP OEMXSDT> on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.5 on pci0
> pci7: <ACPI PCI bus> on pcib2
> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101> mem 0xfeaf0000-0xfeafffff irq 17 at device 0.0 on pci7
> bge0: CHIP ID 0x00004101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: Ethernet address: 00:13:21:cc:39:35
> bge0: [ITHREAD]
> uhci0: <Intel 82801G (ICH7) USB controller USB-A> port 0xdc00-0xdc1f irq 23 at device 29.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x2f00
> usbus0: <Intel 82801G (ICH7) USB controller USB-A> on uhci0
> uhci1: <Intel 82801G (ICH7) USB controller USB-B> port 0xd880-0xd89f irq 19 at device 29.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x2f00
> usbus1: <Intel 82801G (ICH7) USB controller USB-B> on uhci1
> uhci2: <Intel 82801G (ICH7) USB controller USB-C> port 0xd800-0xd81f irq 18 at device 29.2 on pci0
> uhci2: [ITHREAD]
> uhci2: LegSup = 0x2f00
> usbus2: <Intel 82801G (ICH7) USB controller USB-C> on uhci2
> ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xfe9ffc00-0xfe9fffff irq 23 at device 29.7 on pci0
> ehci0: [ITHREAD]
> usbus3: EHCI version 1.0
> usbus3: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
> pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci8: <ACPI PCI bus> on pcib3
> atapci0: <Promise PDC40718 SATA300 controller> port 0xec00-0xec7f,0xe800-0xe8ff mem 0xfebff000-0xfebfffff,0xfebc0000-0xfebdffff irq 16 at device 0.0 on pci8
> atapci0: [ITHREAD]
> atapci0: [ITHREAD]
> ata2: <ATA channel 0> on atapci0
> ata2: SIGNATURE: 00000101
> ata2: [ITHREAD]
> ata3: <ATA channel 1> on atapci0
> ata3: SIGNATURE: 00000101
> ata3: [ITHREAD]
> ata4: <ATA channel 2> on atapci0
> ata4: [ITHREAD]
> ata5: <ATA channel 3> on atapci0
> ata5: SIGNATURE: 00000101
> ata5: [ITHREAD]
> vgapci0: <VGA-compatible display> port 0xe000-0xe0ff mem 0xe8000000-0xefffffff,0xfebb0000-0xfebbffff irq 16 at device 2.0 on pci8
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci1: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci1
> ata0: [ITHREAD]
> ahci0: <Intel ICH7 AHCI SATA controller> port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xfe9ff800-0xfe9ffbff irq 19 at device 31.2 on pci0
> ahci0: [ITHREAD]
> ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier not supported
> ahcich0: <AHCI channel> at channel 0 on ahci0
> ahcich0: [ITHREAD]
> ahcich1: <AHCI channel> at channel 1 on ahci0
> ahcich1: [ITHREAD]
> ahcich2: <AHCI channel> at channel 2 on ahci0
> ahcich2: [ITHREAD]
> ahcich3: <AHCI channel> at channel 3 on ahci0
> ahcich3: [ITHREAD]
> acpi_button0: <Power Button> on acpi0
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> uart0: [FILTER]
> uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
> uart1: [FILTER]
> ppc0: <Parallel port> port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
> ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
> ppc0: FIFO with 16/16/8 bytes threshold
> ppc0: [ITHREAD]
> ppbus0: <Parallel port bus> on ppc0
> plip0: <PLIP network interface> on ppbus0
> plip0: [ITHREAD]
> lpt0: <Printer> on ppbus0
> lpt0: [ITHREAD]
> lpt0: Interrupt-driven port
> ppi0: <Parallel I/O> on ppbus0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc8fff,0xc9000-0xcdfff,0xcf800-0xd47ff,0xd4800-0xd57ff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 102400001024
> device_attach: est0 attach returned 6
> p4tcc0: <CPU Frequency Thermal Control> on cpu0
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> est: CPU supports Enhanced Speedstep, but is not recognized.
> est: cpu_vendor GenuineIntel, msr 102400001024
> device_attach: est1 attach returned 6
> p4tcc1: <CPU Frequency Thermal Control> on cpu1
> ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
> to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
> ZFS filesystem version 5
> ZFS storage pool version 28
> Timecounters tick every 1.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 12Mbps Full Speed USB v1.0
> usbus2: 12Mbps Full Speed USB v1.0
> usbus3: 480Mbps High Speed USB v2.0
> ugen0.1: <Intel> at usbus0
> uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> ugen1.1: <Intel> at usbus1
> uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
> ugen2.1: <Intel> at usbus2
> uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
> ugen3.1: <Intel> at usbus3
> uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
> ad4: 715404MB <WDC WD7500AALX-009BA0 15.01H15> at ata2-master UDMA100 SATA 3Gb/s
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
> ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
> ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
> ad6: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata3-master UDMA100 SATA 3Gb/s
> ad10: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata5-master UDMA100 SATA 3Gb/s
> SMP: AP CPU #1 Launched!
> Root mount waiting for: usbus3 usbus2 usbus1 usbus0
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> uhub2: 2 ports with 2 removable, self powered
> Root mount waiting for: usbus3
> uhub3: 6 ports with 6 removable, self powered
> Root mount waiting for: usbus3
> ugen3.2: <Seagate> at usbus3
> umass0: <Seagate FreeAgent Go, class 0/0, rev 2.00/1.38, addr 2> on usbus3
> umass0: SCSI over Bulk-Only; quirks = 0x0000
> ugen0.2: <American Power Conversion> at usbus0
> umass0:4:0:-1: Attached to scbus4
> Trying to mount root from zfs:zroot
> da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
> da0: <Seagate FreeAgent Go 0138> Fixed Direct Access SCSI-4 device
> da0: 40.000MB/s transfers
> da0: 610480MB (1250263726 512 byte sectors: 255H 63S/T 77825C)
> bge0: link state changed to UP
> S
> log_sysevent: type 19 is not implemented
> ata2: SIGNATURE: ffffffff
> ata2: timeout waiting to issue command
> ata2: error issuing SETFEATURES SET TRANSFER MODE command
> ata2: timeout waiting to issue command
> ata2: error issuing SETFEATURES ENABLE RCACHE command
> ata2: timeout waiting to issue command
> ata2: error issuing SETFEATURES ENABLE WCACHE command
> ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558741
> ata2: SIGNATURE: 00000101
> ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558869
> ata2: FAILURE - already active DMA on this device
> ata2: setting up DMA failed
> ata2: FAILURE - already active DMA on this device
> ata2: setting up DMA failed
George,
Can you please install ports/sysutils/smartmontools (should be version
5.41; if you have an older version please upgrade) and provide output
from the following command:
smartctl -a /dev/ad4
With this I should be able to rule out weird disk problems. It's always
good to start there.
For those unable to parse the above topology, the system has two SATA
controllers (the Promise uses ata(4), while the on-board ICH7 is in AHCI
mode and is using ahci.ko (AHCI-to-CAM)):
atapci0 = Promise PDC40718 (Promise SATA300 TX4)
  --> ata2-master = ad4 = WDC WD7500AALX-009BA0
  --> ata2-slave = <empty>
  --> ata3-master = ad6 = WDC WD6401AALS-00J7B1
  --> ata3-slave = <empty>
  --> ata4-master = <empty>
  --> ata4-slave = <empty>
  --> ata5-master = ad10 = WDC WD6401AALS-00J7B1
  --> ata5-slave = <empty>
ahci0 = Intel ICH7 on-board in AHCI mode
  --> ahcich0 = ada0 = ST3250410AS 3.AAA
  --> ahcich1 = ada1 = ST3250410AS 3.AAA
  --> ahcich2 = <empty>
  --> ahcich3 = <empty>
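For anyone who wants to rebuild the populated part of that mapping without reading the dmesg by eye, here is a rough Python sketch. It is only an illustration: the regular expressions are my own guesses based on the attach lines quoted above (with the mail-client line wraps rejoined), the function name disk_topology is made up, and other drivers or FreeBSD versions may print these lines differently. Empty channels are not reported, and the model string includes the firmware revision.

```python
import re

# Attach lines copied from the dmesg quoted above, with line wraps rejoined.
DMESG = """\
ad4: 715404MB <WDC WD7500AALX-009BA0 15.01H15> at ata2-master UDMA100 SATA 3Gb/s
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
ad6: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata3-master UDMA100 SATA 3Gb/s
ad10: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata5-master UDMA100 SATA 3Gb/s
"""

# ata(4) style (assumed pattern): "ad4: 715404MB <MODEL FW> at ata2-master ..."
ATA_RE = re.compile(
    r"^(ad\d+): \d+MB <([^>]+)> at (ata\d+-(?:master|slave))", re.M
)
# CAM style attach line (assumed pattern): "ada0 at ahcich0 bus 0 ..."
CAM_AT_RE = re.compile(r"^(ada\d+) at (ahcich\d+) ", re.M)
# CAM style model line (assumed pattern): "ada0: <MODEL FW> ATA-7 ..."
CAM_MODEL_RE = re.compile(r"^(ada\d+): <([^>]+)>", re.M)


def disk_topology(dmesg):
    """Return {device: (channel, model)} for every disk attach line found."""
    topo = {}
    for dev, model, chan in ATA_RE.findall(dmesg):
        topo[dev] = (chan, model)
    models = dict(CAM_MODEL_RE.findall(dmesg))
    for dev, chan in CAM_AT_RE.findall(dmesg):
        topo[dev] = (chan, models.get(dev, "<unknown>"))
    return topo


if __name__ == "__main__":
    for dev, (chan, model) in sorted(disk_topology(DMESG).items()):
        print(f"{chan:12s} = {dev:5s} = {model}")
```

On a live system the same function could be fed the contents of /var/run/dmesg.boot instead of the embedded sample.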
If you can't get this situation solved, I'd recommend spending $40
(pocket change) to invest in a Silicon Image 3124 card. Your existing
Promise controller is a PCI card (not PCIe or PCI-X), and I don't know
if your motherboard has any PCIe or PCI-X slots, so I'm going to assume
the 133MByte/sec limitation is acceptable to you. As such, that
effectively limits you to this card:
http://www.newegg.com/Product/Product.aspx?Item=N82E16816132017
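To put the 133MByte/sec figure in context, here is a back-of-the-envelope sketch in Python. It assumes a conventional 32-bit, 33 MHz PCI slot (the usual source of that number) and, purely for illustration, an even split across three busy disks. Real throughput will be lower because of bus overhead and anything else sharing the bus; the function names are made up for this example.

```python
# Theoretical peak of conventional PCI, assuming a 32-bit bus at ~33.33 MHz.
BUS_WIDTH_BYTES = 4          # 32 bits per transfer
BUS_CLOCK_HZ = 33_333_333    # nominal 33 MHz PCI clock


def pci_peak_mbytes_per_sec(width_bytes=BUS_WIDTH_BYTES, clock_hz=BUS_CLOCK_HZ):
    """Peak bus bandwidth in MByte/sec (one transfer per clock, no overhead)."""
    return width_bytes * clock_hz / 1_000_000


def per_disk_share(total_mbytes_per_sec, disks):
    """Naive even split of the bus between disks that are all busy at once."""
    return total_mbytes_per_sec / disks


if __name__ == "__main__":
    peak = pci_peak_mbytes_per_sec()
    print(f"PCI peak: {peak:.0f} MByte/sec")
    print(f"Even split across 3 disks: {per_disk_share(peak, 3):.1f} MByte/sec each")
```

The point is only that all disks behind a single PCI card share one bus ceiling, whichever controller chip is on the card.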
You do not have to use the RAID functionality of the card. FreeBSD
supports this card using siis(4) and it does utilise CAM, so your disks
would show up as adaX. The driver is actively supported/maintained.
Avoid looking at cards which use the 3112, 3114, or 3512 chips.
Hope this helps, or at least directs you in a path that lets you solve
the problem through a little bit of money.
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |