Has anyone experienced this:

twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20
twa0: INFO: (0x16: 0x1108): Resetting controller...:
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
...
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
twa0: INFO: (0x16: 0x1107): Controller reset done!:

This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a number of machines with the following hardware configuration:

- Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
- 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives (configured as 8 SINGLE DISK units, aka JBOD)

All hardware components, including the server chassis, are listed in the 3ware hardware compatibility lists. It doesn't seem to be a cabling or power issue. The controller and hard drives are already flashed to the latest firmware revisions. I tried turning off NCQ, but it didn't make any difference. I also tried switching the kernel from PAE to non-PAE (reducing the usable memory to 3GB), but it didn't help either.

I have other machines with similar I/O configurations (3ware), but with Intel motherboards and running FreeBSD-5.5, and these have been running fine for about a year. Now I'm thinking about swapping the drives between a working Intel box and an AMD-based box, to see which one the controller timeouts follow.

The problem happens sporadically, about once a month, and is very hard to reproduce. Sometimes it takes several weeks until the next crash happens, sometimes it crashes again in just a few hours.

When it happens, the kernel sometimes panics (most likely due to the inconsistent filesystem state caused by the controller reset) and sometimes just hangs. The hang can be interrupted (I have a serial console), but the only usable thing after that seems to be "call cpu_reset()", followed by a full (and sometimes painfully long) filesystem check.
Here are the diffs against the default GENERIC and PAE kernel configurations:

< cpu I486_CPU
< ident GENERIC
< options INET6 # IPv6 communications protocols
< options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI

> options QUOTA
> options SMP # Symmetric MultiProcessor Kernel
> options BREAK_TO_DEBUGGER
> options DDB
> options KDB
> options KDB_UNATTENDED

> options IPFIREWALL
> options DUMMYNET

I'm attaching the dmesg.boot following the latest crash.

Regards,
Atanas

-------------- next part --------------
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-PRERELEASE #0: Mon Nov 13 17:47:40 PST 2006
    root@xyz:/var/obj/usr/src/sys/XYZ-PAE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual Core AMD Opteron(tm) Processor 270 (2009.27-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x20f12 Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
  AMD Features2=0x3<LAHF,CMP>
  Cores per package: 2
real memory = 5368709120 (5120 MB)
avail memory = 4182241280 (3988 MB)
ACPI APIC Table: <PTLTD APIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 <Version 1.1> irqs 0-23 on motherboard
ioapic1 <Version 1.1> irqs 24-27 on motherboard
ioapic2 <Version 1.1> irqs 28-31 on motherboard
kbd1 at kbdmux0
acpi0: <PTLTD RSDT> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
pci0: <serial bus, USB> at device 2.0 (no driver attached)
atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1400-0x140f at device 6.0 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci1: <display, VGA> at device 6.0 (no driver attached)
fxp0: <Intel 82551 Pro/100 Ethernet> port 0x2400-0x243f mem 0xda101000-0xda101fff,0xda120000-0xda13ffff irq 16 at device 8.0 on pci1
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:e0:81:33:b5:f1
pcib2: <ACPI PCI-PCI bridge> at device 13.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 14.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci24: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 10.0 on pci24
pci25: <ACPI PCI bus> on pcib5
3ware device driver for 9000 series storage controllers, version: 3.60.02.012
twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xde000000-0xdfffffff,0xdc300000-0xdc300fff irq 27 at device 3.0 on pci25
twa0: [FAST]
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
pci24: <base peripheral, interrupt controller> at device 10.1 (no driver attached)
pcib6: <ACPI PCI-PCI bridge> at device 11.0 on pci24
pci26: <ACPI PCI bus> on pcib6
bge0: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc410000-0xdc41ffff,0xdc400000-0xdc40ffff irq 28 at device 9.0 on pci26
miibus1: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> on miibus1
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:e0:81:33:b6:f4
bge1: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc430000-0xdc43ffff,0xdc420000-0xdc42ffff irq 29 at device 9.1 on pci26
miibus2: <MII bus> on bge1
brgphy1: <BCM5704 10/100/1000baseTX PHY> on miibus2
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:e0:81:33:b6:f5
pci24: <base peripheral, interrupt controller> at device 11.1 (no driver attached)
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging disabled
da0 at twa0 bus 0 target 0 lun 0
da0: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da1 at twa0 bus 0 target 1 lun 0
da1: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da1: 100.000MB/s transfers
da1: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da2 at twa0 bus 0 target 2 lun 0
da2: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da2: 100.000MB/s transfers
da2: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da3 at twa0 bus 0 target 3 lun 0
da3: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da3: 100.000MB/s transfers
da3: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da4 at twa0 bus 0 target 4 lun 0
da4: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da4: 100.000MB/s transfers
da4: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da5 at twa0 bus 0 target 5 lun 0
da5: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da5: 100.000MB/s transfers
da5: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da6 at twa0 bus 0 target 6 lun 0
da6: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da6: 100.000MB/s transfers
da6: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
da7 at twa0 bus 0 target 7 lun 0
da7: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
da7: 100.000MB/s transfers
da7: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
/: mount pending error: blocks 208 files 5
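
Since the resets are sporadic, it may help to tally twa messages from saved console or syslog output when comparing machines. The following is only a minimal sketch, not part of the original report: it assumes the messages keep the "twaN: SEVERITY: (0xNN: 0xNNNN): text" layout shown above, and the names TWA_LINE, parse_twa_events and count_by_code are hypothetical helpers made up for this illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the messages quoted in this thread:
#   twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = ...
TWA_LINE = re.compile(
    r"(?P<dev>twa\d+): (?P<sev>[A-Z]+): "
    r"\((?P<src>0x[0-9A-Fa-f]+): (?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<msg>.*)"
)


def parse_twa_events(lines):
    """Return one dict per line that looks like a twa driver message."""
    events = []
    for line in lines:
        match = TWA_LINE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def count_by_code(events):
    """Count events per (severity, message code) pair."""
    return Counter((event["sev"], event["code"]) for event in events)


# Sample input copied from the report above.
SAMPLE = [
    "twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20",
    "twa0: INFO: (0x16: 0x1108): Resetting controller...: ",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
    "twa0: INFO: (0x16: 0x1107): Controller reset done!: ",
    "twa0: [FAST]",  # not an event line; ignored by the pattern
]

if __name__ == "__main__":
    for key, total in sorted(count_by_code(parse_twa_events(SAMPLE)).items()):
        print(key, total)
```

Feeding it the lines of a log file instead of SAMPLE (for example by iterating over an open file) would give a per-code count for each box, which could make the Intel versus AMD comparison easier to read.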
adam radford
2006-Nov-14 19:59 UTC
twa: Passthru request timed out! Resetting controller...
Atanas,

Are you running the latest 3ware firmware on that controller?

-Adam
Mark Dotson
2006-Nov-14 22:30 UTC
twa: Passthru request timed out! Resetting controller...
I've had continued problems with the 3ware series SATA cards and the Tyan boards. Specifically, I have a "Tyan S5360-1U" and both a 9500S-4LP and an 8506 series 3ware card. In my case the first error is different, but the 'resetting' over and over is VERY familiar. It could be triggered by a simple file copy from one part of a container to another, which degraded the unit and triggered the resetting crap. Note that the drives are fine; I tested that first thing.

Sep 8 11:59:23 localhost kernel: 3w-9xxx: scsi0: WARNING: (0x06:0x002C): Unit #1: Command (0x2a) timed out, resetting card.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=0.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=1.

I also found this problem to exist across platforms, not just FreeBSD. For example, the excerpt above is from a CentOS box. All tests were done with the newest firmware for both card and mobo, and using the newest drivers provided by 3ware.

Once I removed the card and drives from the Tyan system and stuck them in pretty much ANY other system, they worked fantastically.

I don't have an answer for the "resetting problem" as of yet. 3ware and Tyan (and my system vendor, "Appro") are still trying to find my specific problem and solve it. I believe they are currently doing the "replace everything" method of troubleshooting.

-Mark