Dear Sirs,
I have already reported very strange behaviour of FreeBSD 5.3 on a
suspicious hardware platform of mine, and I would like to repeat that
report in the hope that someone can offer me some help. I am repeating
this suspected bug because, given how strangely FreeBSD 5.3 behaves on
this box, I do not really believe in a hardware fault.
I hope you can follow my non-engineer English.
I have been using FreeBSD for about 8 years on several platforms,
especially SMP platforms once FreeBSD supported them. These boxes were
used for scientific server services and as high-performance desktop
platforms, because FreeBSD prior to 5.X was as solid as a rock (with the
exception of the early days of 4.0).
Today I work on my private platform at my lab, because our computer
center prefers Linux and I do not. So my hardware platform may seem a
little bit 'ancient', but I think a lot of us still use such boxes for
heavy-duty tasks.
All right, here are the facts.
Hardware:
Mainboard ASUS CUR-DLS, two 1GHz PIII, 2x 512MB ECC/reg, 3x SCSI U160
hard drives, 1x Intel PRO/1000 64-bit PCI 1 GBit server NIC, DVD-RW
NEC ND-3500AG/2.18 attached to the ATA controller.
Please keep in mind that this mobo has a built-in PCI VGA controller.
The BIOS has been updated to the latest available release, and when that
did not fix any of the problems I updated to the latest BETA BIOS
available for this mobo.
After POST I get a summary screen, and there I noticed that the VGA
controller is not assigned an IRQ (I suspect that FreeBSD 5.3 has a lot
of trouble with IRQ routing, but more on that later).
The first thing I find curious:
Even when booted into single user mode for maintenance purposes, the box
remains buggy! After a while of heavy screen output, as produced by
compiling a kernel, building world, 'find'-ing all files of the file
system or showing the contents of a file via more or similar, the
box/the screen freezes. I can then press 'return' and watch blank lines
being filled in, but nothing more happens; the box seems to be stuck.
The only rescue is a reboot. This happens with every variant of booting
into single user mode: SMP enabled/disabled in the kernel, apic
enabled/disabled, acpi enabled/disabled. The only way to get rid of this
is to plug a separate PCI VGA card into another slot!! Then the machine
boots correctly into single user mode, but dies immediately when booting
into multi-user mode anyway. With a UP kernel in multi-user mode, this
box is very stable with the X11 GUI (Xorg or XFree86), using the
built-in VGA.
The next serious problem is plugging in sound or VGA cards. It seems to
depend heavily on which slot such a card is plugged into. Different
sound cards, different PCI VGA cards - same problem: FreeBSD 5.3 starts
booting and then freezes after init of the SCSI controller. The same
thing happens when I disable both serial ports! FreeBSD dies after init
of the SCSI controller.
The mobo has two 64-bit PCI slots, and I use one of them for the Intel
PRO/1000 GBit NIC. With the NIC in the slot chosen now (the 64-bit slot
nearest to the CPUs), using a sound card is impossible. FreeBSD dies
with every combination (w/ or w/o ACPI) I tried.
SMP is impossible, w/ or w/o ACPI: FreeBSD 5.3 dies after a while of
doing graphical output with the X11 GUI. Today I stressed the machine
with buildworld, a kernel build and a build of OpenOffice 1.1.4, all
with console output, and for 15 hours the box was stable as a rock. But
switching to the X11 GUI and doing some Firefox jobs (simply surfing the
www) made the system die within minutes. Sometimes I can 'feel' when a
crash is coming: the box becomes kind of 'calm', and I can switch to the
console and sometimes catch the error message from the debugger, but
sometimes not. This makes me suspect that the operating system is
'waiting' for something related to the second CPU or similar, and
waiting forever. I'm not familiar with kernel programming and
development, so my report may seem a bit 'strange'; sorry for that.
Another curiosity is booting the box in 'safe mode'; I think APIC, SMP
and ACPI are all off then. The box becomes slow, graphical output
'hangs' for seconds, freezes and unfreezes, and the system remains more
or less unusable.
The mobo in use, the ASUS CUR-DLS, is based on the RCC LE 3.0 Champion
chipset. The IRQ problems (PCI cards cannot be used in just any PCI
slot) are similar to problems I had in the past with FreeBSD 5.0 on a
TYAN Thunder 2500 box, where I was not able to plug the AMI RAID
controller into any of the 6 64-bit slots. FreeBSD got stuck at the same
place: when the built-in SCSI controller (LSI Logic 1010-33 or 894-33)
gets initialised. This makes me very curious.
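To compare IRQ assignments between boots or between slot layouts, a
small script along the following lines could tabulate the 'irq N' fields
of a dmesg listing. This is only a sketch: the names irq_map and SAMPLE
are made up for illustration, the sample lines are copied from the dmesg
output of this box, and the parsing assumes that probe-line format,
including lines that the mail client wrapped.

```python
import re
from collections import defaultdict

# A probe line starts with a device name such as "em0: " or "atkbdc0: ".
DEVICE_RE = re.compile(r"^([a-z_]+\d+): ")
# The interrupt appears as "irq 21"; "irqs 0-15" (ioapic lines) does not match.
IRQ_RE = re.compile(r"\birq (\d+)\b")


def irq_map(dmesg_text):
    """Return {irq number: [device names]} found in dmesg-style text."""
    irqs = defaultdict(list)
    current = None  # device named on the most recent probe line
    for line in dmesg_text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = m.group(1)
        # A wrapped continuation line has no device prefix, so any
        # "irq N" on it is attributed to the last device seen.
        for irq in IRQ_RE.findall(line):
            if current is not None and current not in irqs[int(irq)]:
                irqs[int(irq)].append(current)
    return dict(irqs)


# Illustrative input: a few lines taken from the dmesg output of this box.
SAMPLE = """\
ohci0: <OHCI (generic) USB controller> mem 0xfc000000-0xfc000fff irq 9 at
device 15.2 on pci0
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port
0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 21 at device 2.0 on pci1
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
"""

if __name__ == "__main__":
    for irq, devices in sorted(irq_map(SAMPLE).items()):
        print(f"irq {irq:2d}: {', '.join(devices)}")
```

Devices that share an interrupt end up in the same row, so two listings
taken with a card in different slots can be compared side by side.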
I am now building a kernel with debug options and will try to catch a
kernel dump. Maybe one of you is interested in that. I am attaching
mptable and dmesg output; I hope it is of use to you.
Oliver
-------------- next part --------------
==============================================================================
MPTable, version 2.0.15
-------------------------------------------------------------------------------
MP Floating Pointer Structure:
location: BIOS
physical address: 0x000f5270
signature: '_MP_'
length: 16 bytes
version: 1.4
checksum: 0xe3
mode: Virtual Wire
-------------------------------------------------------------------------------
MP Config Table Header:
physical address: 0x000f4e60
signature: 'PCMP'
base table length: 276
version: 1.4
checksum: 0x12
OEM ID: 'OEM00000'
Product ID: 'PROD00000000'
OEM table pointer: 0x00000000
OEM table size: 0
entry count: 26
local APIC address: 0xfee00000
extended table length: 124
extended table checksum: 198
-------------------------------------------------------------------------------
MP Config Base Table Entries:
--
Processors:     APIC ID  Version  State        Family  Model  Step  Flags
                3        0x11     BSP, usable  6       8      6     0x387fbff
                0        0x11     AP, usable   6       8      6     0x387fbff
--
Bus:            Bus ID  Type
                0       PCI
                1       PCI
                2       ISA
--
I/O APICs:      APIC ID  Version  State   Address
                2        0x11     usable  0xfec00000
                3        0x11     usable  0xfec01000
--
I/O Ints:       Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  conforms   conforms  2       0     2        0
                INT     conforms   conforms  2       1     2        1
                INT     conforms   conforms  2       0     2        2
                INT     conforms   conforms  2       3     2        3
                INT     conforms   conforms  2       4     2        4
                INT     conforms   conforms  2       6     2        6
                INT     conforms   conforms  2       7     2        7
                INT     conforms   conforms  2       8     2        8
                INT     conforms   conforms  2       12    2        12
                INT     conforms   conforms  2       13    2        13
                INT     conforms   conforms  2       14    2        14
                INT     conforms   conforms  2       15    2        15
                INT     active-lo  level     0       15:A  3        14
                INT     active-lo  level     2       9     2        9
                INT     active-lo  level     1       2:A   3        5
                INT     active-lo  level     1       5:A   3        8
                INT     active-lo  level     1       5:B   3        9
--
Local Ints:     Type    Polarity   Trigger   Bus ID  IRQ   APIC ID  PIN#
                ExtINT  active-hi  edge      2       0     255      0
                NMI     active-hi  edge      2       0     255      1
-------------------------------------------------------------------------------
MP Config Extended Table Entries:
--
System Address Space
bus ID: 0 address type: I/O address
address base: 0x0
address range: 0x10000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0x40000000
address range: 0xbebe0000
--
System Address Space
bus ID: 0 address type: prefetch address
address base: 0xfebe0000
address range: 0xe9420000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xe8000000
address range: 0x18000000
--
System Address Space
bus ID: 0 address type: memory address
address base: 0xa0000
address range: 0x20000
--
Bus Heirarchy
bus ID: 2 bus info: 0x01 parent bus ID: 0
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000000
--
Compatibility Bus Address
bus ID: 0 address modifier: add
predefined range: 0x00000001
-------------------------------------------------------------------------------
dmesg output:
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #35: Sun Jan 16 17:27:11 UTC 2005
root@edda.physik.uni-mainz.de:/usr/obj/usr/src/sys/EDDA
ACPI APIC Table: <ASUS CUR-DLS >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (1000.04-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x686 Stepping = 6
Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 1073721344 (1023 MB)
avail memory = 1041166336 (992 MB)
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
netsmb_dev: loaded
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <ASUS CUR-DLS> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: <display, VGA> at device 7.0 (no driver attached)
isab0: <PCI-ISA bridge> port 0xe800-0xe80f at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks ROSB4 UDMA33 controller> port
0xd400-0xd40f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ohci0: <OHCI (generic) USB controller> mem 0xfc000000-0xfc000fff irq 9 at
device 15.2 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
ugen0: OmniVision OV511+ Camera, rev 1.00/1.00, addr 2
pcib1: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib1
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port
0xd000-0xd03f mem 0xfb800000-0xfb81ffff irq 21 at device 2.0 on pci1
em0: Ethernet address: 00:07:e9:14:8f:7b
em0: Speed:N/A Duplex:N/A
sym0: <1010-33> port 0xb800-0xb8ff mem
0xfa800000-0xfa801fff,0xfb000000-0xfb0003ff irq 24 at device 5.0 on pci1
sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
sym0: handling phase mismatch from SCRIPTS.
sym0: [GIANT-LOCKED]
sym1: <1010-33> port 0xb400-0xb4ff mem
0xf9800000-0xf9801fff,0xfa000000-0xfa0003ff irq 25 at device 5.1 on pci1
sym1: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: open drain IRQ line driver, using on-chip SRAM
sym1: using LOAD/STORE-based firmware.
sym1: handling phase mismatch from SCRIPTS.
sym1: [GIANT-LOCKED]
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: <ECP parallel printer port> port 0x778-0x77a,0x378-0x37f irq 7 drq 3
on acpi0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on
acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: <ISA Option ROMs> at iomem 0xc8000-0xcbfff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <8 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fb0 at vga0
Timecounter "TSC" frequency 1000040215 Hz quality 800
Timecounters tick every 1.250 msec
Fast IPsec: Initialized Security Association Processing.
acd0: DVDR <NEC DVD RW ND-3500AG/2.18> at ata0-master UDMA33
Waiting 3 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
(noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
da0 at sym0 bus 0 target 0 lun 0
da0: <IBM IC35L018UWD210-0 S5BS> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing
Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at sym0 bus 0 target 1 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing
Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da2 at sym0 bus 0 target 2 lun 0
da2: <FUJITSU MAJ3182MP 5207> Fixed Direct Access SCSI-3 device
da2: 160.000MB/s transfers (80.000MHz, offset 62, 16bit), Tagged Queueing
Enabled
da2: 17429MB (35694904 512 byte sectors: 255H 63S/T 2221C)
cd0 at ata0 bus 0 target 0 lun 0
cd0: <_NEC DVD_RW ND-3500AG 2.18> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
GEOM_LABEL: Label for provider da0s1d is ufs/var.
GEOM_LABEL: Label for provider da0s1e is ufs/compat.
GEOM_LABEL: Label for provider da0s1f is ufs/src.
GEOM_LABEL: Label for provider da0s1g is ufs/usr.
GEOM_LABEL: Label for provider da0s1h is ufs/local.
GEOM_LABEL: Label for provider da1s1d is ufs/obj.
GEOM_LABEL: Label for provider da1s1e is ufs/ports.
GEOM_LABEL: Label for provider da1s1f is ufs/scratch.
GEOM_LABEL: Label for provider da1s1g is ufs/data.
Mounting root from ufs:/dev/da0s1a
GEOM_LABEL: Label for provider da0s1e is ufs/compat.
GEOM_LABEL: Label for provider da1s1g is ufs/data.
GEOM_LABEL: Label for provider da1s1d is ufs/obj.
GEOM_LABEL: Label for provider da0s1g is ufs/usr.
GEOM_LABEL: Label for provider da1s1e is ufs/ports.
GEOM_LABEL: Label for provider da1s1f is ufs/scratch.
GEOM_LABEL: Label for provider da0s1h is ufs/local.
GEOM_LABEL: Label for provider da0s1f is ufs/src.
GEOM_LABEL: Label for provider da0s1d is ufs/var.
pflog0: promiscuous mode enabled
em0: Link is up 100 Mbps Full Duplex
===============================================================================