robert mena
2010-Dec-27 19:04 UTC
[CentOS] Problems with motherboard support? INTEL DP43BF
Hi,

I've installed CentOS 5.5 (plus updates) on a machine with an Intel DP43BF motherboard. To make Linux detect the PCI devices, I added pci=assign-busses to my GRUB configuration.

Everything runs fine, but after less than two days of uptime the machine simply freezes (black console, no connectivity). This has happened more than once, so I consider it a real problem. Memtest passed without errors, and the machine uses a CompactFlash card (SanDisk Extreme III 4GB) as its disk.

The only error messages I could find are in /var/log/messages, but they appear hours before the actual lockup:

kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
kernel: eth4: PCI Bus error a290.
kernel: eth4: PCI Bus error 0290.
kernel: eth3: PCI Bus error 2290.
kernel: eth3: PCI Bus error 0290.

Any tips?
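Because the errors show up hours before the freeze, it may help to tally them per interface and see whether one adapter stands out. The following is only a rough sketch: the function name count_pci_bus_errors is made up for illustration, and the regular expression assumes the messages look exactly like the lines quoted in this post.

```python
import re
from collections import Counter

# Matches lines such as "kernel: eth4: PCI Bus error a290."
# (pattern is an assumption based only on the lines quoted in the post)
PCI_ERR = re.compile(r"kernel: (eth\d+): PCI Bus error ([0-9a-fA-F]+)\.")

def count_pci_bus_errors(lines):
    """Return a Counter mapping interface name -> number of PCI bus errors."""
    counts = Counter()
    for line in lines:
        match = PCI_ERR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample input: the messages quoted in the post.
sample = [
    "kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001",
    "kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001",
    "kernel: eth4: PCI Bus error a290.",
    "kernel: eth4: PCI Bus error 0290.",
    "kernel: eth3: PCI Bus error 2290.",
    "kernel: eth3: PCI Bus error 0290.",
]

for iface, n in sorted(count_pci_bus_errors(sample).items()):
    print(iface, n)
```

To run it against the real log, pass an open file instead of the sample list, for example count_pci_bus_errors(open("/var/log/messages")).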
Bart Schaefer
2010-Dec-27 20:33 UTC
[CentOS] Problems with motherboard support? INTEL DP43BF
On Mon, Dec 27, 2010 at 11:04 AM, robert mena <robert.mena at gmail.com> wrote:
> Hi,
> Everything runs fine but within less than 2 days of uptime the machine
> simply freezes (black console no connectivity). This has happened more than
> one time so I'm considering to be a problem.

What kind of CPU is in there? This sounds like what happens to some brands of CPUs when they overheat. Others just melt.
John R Pierce
2010-Dec-27 21:19 UTC
[CentOS] Problems with motherboard support? INTEL DP43BF
On 12/27/10 11:04 AM, robert mena wrote:
> Hi,
>
> I've installed Centos 5.5 (plus updates) in a machine with INTEL
> DP43BF motherboard. In order to make Linux detect the PCIs I've added
> the pci=assign-busses in my GRUB conf.
>
> Everything runs fine but within less than 2 days of uptime the machine
> simply freezes (black console no connectivity). This has happened
> more than one time so I'm considering to be a problem. The memtest
> passed without a problem and the machine uses a compact flash (sandisk
> extreme III 4GB) as a disk.
>
> I could only find the error messages in my /var/log/messages but those
> appear hours before the actual lock.
>
> kernel: 0000:00:1a.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
> kernel: 0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
>
> kernel: eth4: PCI Bus error a290.
> kernel: eth4: PCI Bus error 0290.
> kernel: eth3: PCI Bus error 2290.
> kernel: eth3: PCI Bus error 0290.
>
> Any tips?
>

That's a desktop board, right? So it probably doesn't have ECC or any of the other system-integrity features of a server board, and desktop boards usually don't have the I/O bus bandwidth to handle substantial I/O workloads. PCI bus errors are not a good thing at all, either.

You have 5 Ethernet adapters in use? What sort of Ethernet controller? I believe those PCI bus errors are being reported by your Ethernet adapters, and they could be the result of excess bus contention. A single GigE port can more than saturate a 32-bit, 33 MHz (parallel) PCI bus. All the PCI slots on a desktop board like yours are on the same bus and contend for the same bandwidth.

Also, as mentioned, thermal problems are a definite possibility. Intel CPUs tend to self-throttle if they get too hot, but the chipset might not be as good at it, so watch the chipset and memory temperatures as well as the CPU.

Another possible cause would be silent memory corruption, although that would be more likely to cause a kernel fault ("Fatal kernel error - system halted"). However, if your display is in a GUI mode, you won't see this unless the console is directed to a serial port that is being monitored.
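
The bus-contention point can be checked with back-of-the-envelope arithmetic. This is a minimal sketch, assuming the adapters sit on a conventional 32-bit, 33 MHz parallel PCI bus (the thread does not confirm this) and using nominal figures only. The function names are made up for illustration, and real PCI throughput is lower than the theoretical peak computed here.

```python
# Theoretical peak of a 32-bit, 33 MHz parallel PCI bus versus gigabit Ethernet.
# All figures are nominal; the bus parameters are an assumption about the board.
PCI_WIDTH_BITS = 32
PCI_CLOCK_HZ = 33_000_000            # nominal 33 MHz, as quoted in the thread
GIGE_BITS_PER_SEC = 1_000_000_000    # gigabit Ethernet line rate, per direction

def pci_peak_bytes_per_sec(width_bits=PCI_WIDTH_BITS, clock_hz=PCI_CLOCK_HZ):
    """Peak transfer rate: one bus-width word per clock, shared by all devices."""
    return (width_bits // 8) * clock_hz

def gige_bytes_per_sec(full_duplex=True):
    """Line rate of one gigabit port; full duplex counts both directions."""
    per_direction = GIGE_BITS_PER_SEC // 8
    return per_direction * (2 if full_duplex else 1)

pci = pci_peak_bytes_per_sec()
one_way = gige_bytes_per_sec(full_duplex=False)
both_ways = gige_bytes_per_sec(full_duplex=True)

print(f"PCI peak:           {pci / 1e6:.0f} MB/s")
print(f"GigE, one direction: {one_way / 1e6:.0f} MB/s ({one_way / pci:.0%} of PCI peak)")
print(f"GigE, full duplex:   {both_ways / 1e6:.0f} MB/s ({both_ways / pci:.0%} of PCI peak)")
```

Under these assumptions a single port sending and receiving at line rate asks for nearly twice what the bus can deliver even in theory, and the bus is shared by every card on it.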