Hi,
I have problems with the following combination:
Mainboard: Intel Serverboard SE7221BK1-E, ICH6 chipset, BIOS version P06
HDD: WD 4000YR, no RAID
FreeBSD 6.2-STABLE-200702
(see the boot messages further down)
From time to time (every 2-4 weeks) the server hangs without any message on
the console or in the logs. It is not a kernel panic: the system freezes and
does not react to anything, except that the kernel debugger can still be
entered.
It seems that the ata driver hangs while handling an interrupt and does not
know what to do.
Can anybody give additional information about this?
----------------------------------------------------------------------------
db> bt
Tracing pid 24 tid 100020 td 0xc637a480
kdb_enter(c072a8e2,e4f91bc8,c0,c637a480,c64ac400,...) at kdb_enter+0x30
siointr1(c64ac400,c64b60c0,c636f4c8,e4f91bec,c06d1799,...) at siointr1+0xd1
siointr(c64ac400,c6370000,e4f91bec,0,c637a480,...) at siointr+0x42
intr_execute_handlers(c636f4c8,e4f91c04,e4f91c7c,c06cdb33,37,...) at intr_execute_handlers+0xfa
lapic_handle_intr(37) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc044c3ab, esp = 0xe4f91c48, ebp = 0xe4f91c7c ---
ata_ahci_status(c64a4880,c63ffd38,c63ffc90,e4f91ce0,c0530521,...) at ata_ahci_status+0x57
ata_interrupt(c6479c00,c64a0cc0,4,e4f91ce0,c050ede2,...) at ata_interrupt+0x68
ata_generic_intr(c6489600,c637a480,f18bb,f2539122,c637a480,...) at ata_generic_intr+0x25
ithread_execute_handlers(c63ffc90,c6376300,c63ffc90,c637a480,c63ffc90,...) at ithread_execute_handlers+0x15e
ithread_loop(c6462170,e4f91d38,ffffffff,ffffffff,ffffffff,...) at ithread_loop+0x63
fork_exit(c050eec0,c6462170,e4f91d38) at fork_exit+0x7a
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f91d6c, ebp = 0 ---
----------------------------------------------------------------------------
Running alltrace, then leaving the debugger and re-entering it:
----------------------------------------------------------------------------
...
Tracing command init pid 1 tid 100007 td 0xc6379480
sched_switch(c6379480,0,1,11dd38b4,8d0595a4,...) at sched_switch+0x158
mi_switch(1,0) at mi_switch+0x1d4
sleepq_switch(c637f000,c6379480,0,e4f73c2c,c052ffcb,...) at sleepq_switch+0x91
sleepq_wait_sig(c637f000,5c,c07179db,100,c649c648,...) at sleepq_wait_sig+0x21
msleep(c637f000,c637f068,15c,c07179db,0,...) at msleep+0x288
kern_wait(c6379480,ffffffff,e4f73c78,0,0,...) at kern_wait+0xb10
wait4(c6379480,e4f73d04,10,1a031,0,...) at wait4+0x3c
syscall(3b,3b,bfbf003b,2,bfbfeef8,...) at syscall+0x34a
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (7, FreeBSD ELF32, wait4), eip = 0x8054197, esp = 0xbfbfed6c, ebp = 0xbfbfed88 ---
Tracing command swapper pid 0 tid 0 td 0xc076f500
sched_switch(c076f500,0,1,b624e016,aa96b765,...) at sched_switch+0x158
mi_switch(1,0,0,0,0,...) at mi_switch+0x1d4
scheduler(0,c1e000,c1ec00,c1e000,0,...) at scheduler+0x224
mi_startup() at mi_startup+0xa0
begin() at begin+0x2c
----------------------------------------------------------------------------
db> continue
~KDB: enter: Line break on console
[thread pid 24 tid 100020 ]
Stopped at kdb_enter+0x30: leave
----------------------------------------------------------------------------
db> bt
Tracing pid 24 tid 100020 td 0xc637a480
kdb_enter(c072a8e2,0,0,c637a480,c64ac400,...) at kdb_enter+0x30
siointr1(c64ac400,c64b60c0,c636f4c8,e4f91c80,c06d1799,...) at siointr1+0xd1
siointr(c64ac400,e4f91c74,c053cba3,0,c637a480,...) at siointr+0x42
intr_execute_handlers(c636f4c8,e4f91c98,e4f91ce0,c06cdb33,37,...) at intr_execute_handlers+0xfa
lapic_handle_intr(37) at lapic_handle_intr+0x3b
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc06d781f, esp = 0xe4f91cdc, ebp = 0xe4f91ce0 ---
spinlock_exit(1,0,c63ffc90,c637a480,c63ffc90,...) at spinlock_exit+0x28
ithread_loop(c6462170,e4f91d38,ffffffff,ffffffff,ffffffff,...) at ithread_loop+0xf4
fork_exit(c050eec0,c6462170,e4f91d38) at fork_exit+0x7a
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f91d6c, ebp = 0 ---
----------------------------------------------------------------------------
Trying to reboot, but the HDD hangs in a timeout loop:
----------------------------------------------------------------------------
db> call boot
Waiting (max 60 seconds) for system process `vnlru' to stop...
done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...
FreeBSD/i386 em1: watchdog timeout -- resetting
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029696
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029760
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
timed out
Waiting (max 60 seconds) for system process `syncer' to stop...ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=100029696
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=100029760
...
----------------------------------------------------------------------------
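For anyone who wants to tally these ata messages from a saved console capture,
here is a minimal sketch in Python. It is only an illustration: the function
name summarize_timeouts and the sample text are made up for this example, and
it assumes each kernel message sits on one line (not wrapped by a mail client).

```python
import re
from collections import Counter

# e.g. "ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029696"
HARD_TIMEOUT = re.compile(
    r"(ad\d+): TIMEOUT - (\S+) retrying \((\d+) retr(?:y|ies) left\) LBA=(\d+)"
)
# e.g. "ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly"
SOFT_TIMEOUT = re.compile(r"(ad\d+): WARNING - (.+?) taskqueue timeout")


def summarize_timeouts(log_text):
    """Count command timeouts per (device, command) and collect affected LBAs."""
    hard = Counter()   # TIMEOUT lines (command retried)
    soft = Counter()   # taskqueue timeout warnings
    lbas = set()       # distinct LBAs named in TIMEOUT lines
    for line in log_text.splitlines():
        m = HARD_TIMEOUT.search(line)
        if m:
            hard[(m.group(1), m.group(2))] += 1
            lbas.add(int(m.group(4)))
            continue
        m = SOFT_TIMEOUT.search(line)
        if m:
            soft[(m.group(1), m.group(2))] += 1
    return hard, soft, sorted(lbas)


# Hypothetical sample, modelled on the messages above.
SAMPLE = """\
ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029696
ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=100029696
ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=100029760
"""

if __name__ == "__main__":
    hard, soft, lbas = summarize_timeouts(SAMPLE)
    for (dev, cmd), n in sorted(hard.items()):
        print(f"{dev} {cmd}: {n} TIMEOUT lines")
    for (dev, cmd), n in sorted(soft.items()):
        print(f"{dev} {cmd}: {n} taskqueue warnings")
    print("LBAs:", lbas)
```

Such a summary only shows which device and which commands are affected; it
says nothing about the cause of the hang.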
REBOOT over IPMI
----------------------------------------------------------------------------
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE-200702 #11: Sun Jul 15 21:17:16 CEST 2007
ACPI APIC Table: <A M I OEMAPIC >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.52-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf43 Stepping = 3
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x649d<SSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,<b14>>
AMD Features=0x20000000<LM>
Logical CPUs per core: 2
real memory = 2138984448 (2039 MB)
avail memory = 2088189952 (1991 MB)
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: <A M I OEMRSDT> on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi0: reservation of 500, 10 (4) failed
acpi0: reservation of 560, 20 (4) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_throttle0: <ACPI CPU Throttling> on cpu0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <display, VGA> at device 2.0 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci2: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci2
pci4: <ACPI PCI bus> on pcib2
em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xef80-0xefbf mem 0xdffe0000-0xdfffffff irq 27 at device 3.0 on pci4
em0: Ethernet address: 00:0e:0c:4a:a7:fd
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci2
pci3: <ACPI PCI bus> on pcib3
uhci0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> port 0xcc00-0xcc1f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> port 0xcc80-0xcc9f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> port 0xcd00-0xcd1f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-C> on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: <Intel 82801FB (ICH6) USB 2.0 controller> mem 0xdfdff800-0xdfdffbff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <Intel 82801FB (ICH6) USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci1: <ACPI PCI bus> on pcib4
em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0xdf80-0xdfbf mem 0xdfee0000-0xdfefffff irq 18 at device 3.0 on pci1
em1: Ethernet address: 00:0e:0c:4a:a7:fc
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH6 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel ICH6 SATA150 controller> port 0xcf80-0xcf87,0xcf00-0xcf03,0xce80-0xce87,0xce00-0xce03,0xcd80-0xcd8f mem 0xdfdffc00-0xdfdfffff irq 19 at device 31.2 on pci0
atapci1: AHCI Version 01.00 controller with 4 ports detected
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1
ata4: <ATA channel 2> on atapci1
ata5: <ATA channel 3> on atapci1
ichsmb0: <SMBus controller> port 0x400-0x41f irq 19 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
smbus0: <System Management Bus> on ichsmb0
ipmi0: <IPMI System Interface> on smbus0
ipmi0: SSIF mode found at address 0x42 on smbus
acpi_button0: <Power Button> on acpi0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
fdc0: <floppy drive controller (FDE)> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc9800-0xca7ff,0xca800-0xcb7ff,0xdc000-0xdffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2992519478 Hz quality 800
Timecounters tick every 1.000 msec
em0: link state changed to UP
em1: link state changed to UP
acd0: DVDROM <HL-DT-STDVD-ROM GDR8163B/0L23> at ata0-master UDMA33
ad4: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata2-master SATA150
ad6: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata3-master SATA150
ad8: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata4-master SATA150
ipmi0: IPMI device rev. 1, firmware rev. 2.81, version 1.5
ipmi0: Number of channels 0
ipmi0: Attached watchdog
Trying to mount root from ufs:/dev/ad4s2a
WARNING: / was not properly dismounted
Loading configuration files.
kernel dumps on /dev/ad4s2b
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/ad4s2b as swap device
Starting file system checks:
/dev/ad4s2a: 3492 files, 72574 used, 940441 free (2689 frags, 117219 blocks, 0.3% fragmentation)
/dev/ad6s2a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2a: clean, 940441 free (497 frags, 117493 blocks, 0.0% fragmentation)
/dev/ad8s1d: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2d: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2e: DEFER FOR BACKGROUND CHECKING
/dev/ad4s2f: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p1: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p2: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p3: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p11: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p10: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p4: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p5: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p12: DEFER FOR BACKGROUND CHECKING
/dev/ad4s3p13: DEFER FOR BACKGROUND CHECKING
/dev/ad6s2d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2d: clean, 5009816 free (15968 frags, 624231 blocks, 0.2% fragmentation)
/dev/ad6s2e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2e: clean, 1437962 free (8354 frags, 178701 blocks, 0.3% fragmentation)
/dev/ad6s2f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s2f: clean, 1011948 free (156 frags, 126474 blocks, 0.0% fragmentation)
/dev/ad6s3p1: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p1: clean, 1384097 free (2433 frags, 172708 blocks, 0.2% fragmentation)
/dev/ad6s3p2: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p2: clean, 11232980 free (14020 frags, 1402370 blocks, 0.1% fragmentation)
/dev/ad6s3p3: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p3: clean, 9642063 free (15775 frags, 1203286 blocks, 0.1% fragmentation)
/dev/ad6s3p11: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p11: clean, 772504 free (6136 frags, 95796 blocks, 0.4% fragmentation)
/dev/ad6s3p10: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p10: clean, 1298386 free (5226 frags, 161645 blocks, 0.3% fragmentation)
/dev/ad6s3p4: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p4: clean, 2140124 free (5428 frags, 266837 blocks, 0.2% fragmentation)
/dev/ad6s3p5: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p5: clean, 776311 free (4455 frags, 96482 blocks, 0.4% fragmentation)
/dev/ad6s3p12: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p12: clean, 14682517 free (5149 frags, 1834671 blocks, 0.0% fragmentation)
/dev/ad6s3p13: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad6s3p13: clean, 7697255 free (6631 frags, 961328 blocks, 0.1% fragmentation)
Mounting local file systems:WARNING: /usr was not properly dismounted