It's taken me a while to track this down, but today I finally narrowed
it all the way down. High network load on the system causes it to hard
lock; nothing but pulling the plug will get any response. The network
interface is nve on an Asus A8N-SLI. The reliable trigger appears to be:
- BitTorrent downloading/seeding at least two torrents. It doesn't
  matter what client you're using; I've done this using both Azureus
  and KTorrent.
- FTP'ing something (either direction) from the box. I've gone so far
  as to throttle the FTP client to 300K/s, and it will still do it.
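For anyone who wants to try generating comparable load without
BitTorrent or an FTP client, here is a minimal sketch of a rate-limited
TCP sender. It is an illustration only and has not been confirmed to
trigger the lockup. The function names (throttled_send, send_to), the
chunk size, and the host/port in the usage note are all assumptions,
not part of the setup described above.

```python
import socket
import time


def throttled_send(sock, total_bytes, rate_bytes_per_sec, chunk_size=16384):
    """Send total_bytes of zeros over sock, pacing writes so the
    average rate stays at or below rate_bytes_per_sec."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate_bytes_per_sec must be positive")
    payload = b"\0" * chunk_size
    sent = 0
    start = time.monotonic()
    while sent < total_bytes:
        n = min(chunk_size, total_bytes - sent)
        sock.sendall(payload[:n])
        sent += n
        # How long `sent` bytes should have taken at the target rate.
        target_elapsed = sent / rate_bytes_per_sec
        delay = target_elapsed - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
    return sent


def send_to(host, port, total_bytes, rate_bytes_per_sec):
    """Open one TCP connection to host:port and push throttled data.
    Something must already be listening on the other end."""
    with socket.create_connection((host, port)) as sock:
        return throttled_send(sock, total_bytes, rate_bytes_per_sec)
```

As a usage example, send_to("192.0.2.10", 5001, 10**9, 300 * 1024)
would push about 1 GB at roughly 300K/s to a listener on a second
machine (the address and port are placeholders). Running a few of these
in parallel, in both directions, would approximate the mixed load
described above.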
Things worth noting: I narrowed this down by doing stupid things to
try to make it crash, such as building world plus 3 or 4 other large
things at once, moving large files between disks, etc. Many things have
triggered this (NFS activity, etc.), but the only common thread I found
was network activity, since it has done this with and without NFS
running (I wanted to eliminate NFS since it seems to be a bit unstable
at the moment) while doing a multitude of tasks. The network cable
connecting this system to the switch is perfect. The switch rarely
shows any collisions unless network load is high on this box; then the
collision light comes on nearly constantly.
dmesg:
root@colossus(~)# dmesg
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-STABLE #6: Sat Sep 2 04:56:20 CDT 2006
lauasanf@colossus.cotharyus.net:/usr/obj/usr/src/sys/Colossus
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x20fb1 Stepping = 1
Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
Features2=0x1<SSE3>
AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
AMD Features2=0x3<LAHF,CMP>
Cores per package: 2
real memory = 1073676288 (1023 MB)
avail memory = 1037369344 (989 MB)
ACPI APIC Table: <Nvidia AWRDACPI>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
ioapic0: Changing APIC ID to 2
ioapic0 <Version 1.1> irqs 0-23 on motherboard
acpi0: <Nvidia AWRDACPI> on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <memory> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
ohci0: <OHCI (generic) USB controller> mem 0xdb102000-0xdb102fff irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: <NVIDIA nForce4 USB 2.0 controller> mem 0xfeb00000-0xfeb000ff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 4 ports each: usb0
usb1: <NVIDIA nForce4 USB 2.0 controller> on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
pcm0: <nVidia nForce4> port 0xd400-0xd4ff,0xd800-0xd8ff mem 0xdb101000-0xdb101fff irq 23 at device 4.0 on pci0
pcm0: <Avance Logic ALC850 AC97 Codec>
atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
pci5: <ACPI PCI bus> on pcib1
fwohci0: <Texas Instruments TSB43AB22/A> mem 0xdb004000-0xdb0047ff,0xdb000000-0xdb003fff irq 16 at device 11.0 on pci5
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:11:d8:00:00:86:18:47
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
nve0: <NVIDIA nForce MCP9 Networking Adapter> port 0xd000-0xd007 mem 0xdb100000-0xdb100fff irq 21 at device 10.0 on pci0
nve0: Ethernet address 00:15:f2:7f:80:86
miibus0: <MII bus> on nve0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nve0: Ethernet address: 00:15:f2:7f:80:86
pcib2: <ACPI PCI-PCI bridge> at device 11.0 on pci0
pci4: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> at device 12.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 13.0 on pci0
pci2: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 14.0 on pci0
pci1: <ACPI PCI bus> on pcib5
nvidia0: <GeForce 6800> mem 0xd8000000-0xd8ffffff,0xd0000000-0xd7ffffff,0xd9000000-0xd9ffffff irq 18 at device 0.0 on pci1
nvidia0: [GIANT-LOCKED]
acpi_tz0: <Thermal Zone> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77b irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcefff,0xd0000-0xd3fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
uhid0: Logitech Inc. WingMan Force 3D, rev 1.00/1.06, addr 2, iclass 3/0
ukbd0: Microsoft Natural\M-. Ergonomic Keyboard 4000, rev 2.00/1.73, addr 3, iclass 3/1
kbd1 at ukbd0
uhid1: Microsoft Natural\M-. Ergonomic Keyboard 4000, rev 2.00/1.73, addr 3, iclass 3/1
umass0: Generic USB Storage Device, rev 1.10/1.00, addr 4
ums0: Microsoft Microsoft Trackball Optical\M-., rev 1.10/1.21, addr 5, iclass 3/1
ums0: 5 buttons and Z dir.
uhid2: Jess Tech GGE909 PC Recoil Pad, rev 1.10/1.01, addr 6, iclass 3/0
Timecounters tick every 1.000 msec
ad0: 43979MB <IBM DTLA-307045 TX6OA6AA> at ata0-master UDMA100
ad1: 190782MB <Seagate ST3200822A 3.01> at ata0-slave UDMA100
ad2: 38166MB <WDC WD400BB-22CAA0 16.06V16> at ata1-master UDMA100
acd0: DVDR <LITE-ON DVDRW SOHW-832S/VS01> at ata1-slave UDMA33
SMP: AP CPU #1 Launched!
cd0 at ata1 bus 0 target 1 lun 0
cd0: <LITE-ON DVDRW SOHW-832S VS01> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
da0 at umass-sim0 bus 0 target 0 lun 0
da0: <Lexar USB Storage-SMC 0180> Removable Direct Access SCSI-0 device
da0: 1.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da1 at umass-sim0 bus 0 target 0 lun 1
da1: <Lexar USB Storage-CFC 0180> Removable Direct Access SCSI-0 device
da1: 1.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
da2 at umass-sim0 bus 0 target 0 lun 2
da2: <Lexar USB Storage-MMC 0180> Removable Direct Access SCSI-0 device
da2: 1.000MB/s transfers
da2: 489MB (1002497 512 byte sectors: 64H 32S/T 489C)
da3 at umass-sim0 bus 0 target 0 lun 3
da3: <Lexar USB Storage-MSC 0180> Removable Direct Access SCSI-0 device
da3: 1.000MB/s transfers
da3: Attempt to query device size failed: NOT READY, Medium not present
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
Trying to mount root from ufs:/dev/ad0s1a
I'll answer any questions I can about this, or provide further
information if needed. I'm interested in seeing if anyone else can
confirm or reproduce this, or if it's a known problem.