A brand-new system set up with RH 7.3 keeps Oopsing every X hours.
The problem started after about 1 week without errors.
Sometimes everything locks up; sometimes messages are still written to
/var/log/messages.
Sometimes other processes keep running (I can still ping the machine).
I upgraded the kernel to 2.4.18-19.7.x and changed the memory, but the
problems persist.
From what I have read in various newsgroups, the cause could be almost anything.
Is 'downgrading' to ext2 an option (or even a solution)?
Problems with swap:
Swap: 522072K av, 0K used, 522072K free
(strange: not a single bit used)
--> How can I put something in the swap, so I can rule this out?
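The only approach I can think of is to allocate and touch more memory than the machine has free. A minimal sketch, assuming a Python interpreter is available; the helper name `touch_memory` and the sizes are my own invention, and whether anything actually lands in swap is up to the kernel:

```python
def touch_memory(total_mb, chunk_mb=16):
    """Allocate total_mb of memory in chunks and write to every page.

    Hypothetical helper: this only creates memory pressure. The kernel
    decides on its own whether any of it ends up in swap.
    """
    page = 4096  # assumed page size (the usual value on i386)
    chunks = []
    remaining = total_mb
    while remaining > 0:
        size_mb = min(chunk_mb, remaining)
        buf = bytearray(size_mb * 1024 * 1024)
        # Write one byte per page so each page is really backed by memory.
        for offset in range(0, len(buf), page):
            buf[offset] = 0xA5
        chunks.append(buf)
        remaining -= size_mb
    # The caller must keep the returned list alive, or the memory is freed.
    return chunks
```

Calling `held = touch_memory(N)` with N somewhat above the installed RAM in MB, and watching the Swap: line in top while `held` is kept alive, should show whether swap gets used at all.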
Hardware:
I read something about memory being used to cache writes to the HD.
--> It looks to me like the crashes occur when the CPU/HD is being thrashed
(the two go together).
How can I stress-test these systems?
HD:
The hard disk is brand new, so it would seem strange to me if it had
already gone bad after 1 week.
--> How can I rule this out?
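What I have in mind is something like writing a large file, reading it back, and comparing checksums. A rough sketch, assuming Python is available; `write_and_verify` and its parameters are names I made up. Note that unless the file is larger than RAM, the read-back is probably served from the page cache rather than from the disk itself:

```python
import hashlib
import os


def write_and_verify(path, blocks=64, block_size=1024 * 1024):
    """Write random blocks to path, read them back, compare SHA-1 digests.

    Returns True when the data read back matches what was written.
    The test file is removed afterwards.
    """
    written = hashlib.sha1()
    with open(path, "wb") as f:
        for _ in range(blocks):
            block = os.urandom(block_size)
            written.update(block)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to disk

    read_back = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            read_back.update(block)

    os.remove(path)
    return written.hexdigest() == read_back.hexdigest()
```

Running this in a loop on the suspect partition would at least load the CPU and the disk at the same time, which is when the crashes seem to happen. A mismatch would not tell me whether the disk, the IDE bus, or the memory is at fault, only that something corrupts data.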
IDE-bus?
Got a strange message at startup about the speed...
System: AMD Athlon 1150 (1 CPU), with Maxtor HD
At startup I see the following; is the first line OK?
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: not 100%% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
ide0: BM-DMA at 0xc000-0xc007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xc008-0xc00f, BIOS settings: hdc:pio, hdd:pio
hda: Maxtor 6E030L0, ATA DISK drive
At panic time, this is in /var/log/messages:
Unable to handle kernel NULL pointer dereference at virtual address 00000015
printing eip:
c0117289
*pde = 00000000
Oops: 0002
autofs tulip 8139too mii ipchains ide-cd cdrom usb-uhci usbcore ext3 jbd
CPU: 0
EIP: 0010:[<c0117289>] Not tainted
EFLAGS: 00010202
EIP is at copy_files [kernel] 0x169 (2.4.18-19.7.x)
eax: 00000001 ebx: cd37ac84 ecx: c7bcf0a0 edx: 00000338
esi: cd37ab4c edi: cac7decc ebp: cac7de40 esp: ca5fdf44
ds: 0018 es: 0018 ss: 0018
Process httpd (pid: 650, stackpage=ca5fd000)
Stack: 0000001b 00000003 00000000 00000360 c7bcf0a0 cd37aac0 00000000 00000011
00000000 c29fa000 c011763e 00000011 c29fa000 fffffff4 c1aa4480 ca5fdf60
00000008 0003aa21 00000001 ca5fc000 ca5fdfa8 bfffb7d8 ca5fc000 40013020
Call Trace: [<c011763e>] do_fork [kernel] 0x2ce (0xca5fdf6c))
[<c0107515>] sys_fork [kernel] 0x15 (0xca5fdfac))
[<c010893b>] system_call [kernel] 0x33 (0xca5fdfc0))
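As far as I understand it, the number after "Oops:" is the i386 page-fault error code, where bit 0 distinguishes a protection fault from a missing page, bit 1 a write from a read, and bit 2 user mode from kernel mode. A small sketch to decode it (the function name `decode_oops_code` is my own):

```python
def decode_oops_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code."""
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = no page found, 1 = protection fault
        "cause": "protection fault" if code & 1 else "page not present",
        # bit 1: 0 = read, 1 = write
        "access": "write" if code & 2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 4 else "kernel",
    }


print(decode_oops_code("0002"))
```

For the Oops above this gives a write to a non-present page from kernel mode, which at least matches the NULL pointer dereference at address 00000015.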
And now and then I see these messages (sometimes without any visible effect):
EXT3-fs error (device ide0(3,2)): ext3_free_blocks: Freeing block in system zone
- block = 2
fsck: contains a file system with errors, check forced.
<2>EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not
in datazone - block = 3471323520, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3275923520, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3326435648, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3422640768, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3479942536, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3251122320, count = 1
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing blocks not in
datazone - block = 3251122320, count = 1
Kernel panic: EXT3-fs panic (device ide0(3,5)): load_block_bitmap: block_group
>= groups_count - block_group = 131071, groups_count = 47
EXT3-fs error (device ide0(3,5)): ext3_free_blocks: Freeing block in system zone
- block = 2
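A quick sanity check on those block numbers. This is only a sketch: the 4096-byte block size is an assumption I have not verified on this filesystem, and the variable names are mine. With one bitmap block per group, a group can hold at most 8 * block size blocks:

```python
GROUPS_COUNT = 47   # from the load_block_bitmap panic message
BLOCK_SIZE = 4096   # ASSUMPTION: not verified on this filesystem

# One bitmap block per group, one bit per block in the group.
BLOCKS_PER_GROUP = 8 * BLOCK_SIZE

# Block numbers copied from the ext3_free_blocks errors.
reported_blocks = [
    3471323520, 3275923520, 3326435648,
    3422640768, 3479942536, 3251122320,
]


def max_block_count(groups, blocks_per_group):
    """Upper bound on the number of blocks in the filesystem."""
    return groups * blocks_per_group


def out_of_range(blocks, limit):
    """Return the block numbers that cannot exist on the filesystem."""
    return [b for b in blocks if b >= limit]


limit = max_block_count(GROUPS_COUNT, BLOCKS_PER_GROUP)
bad = out_of_range(reported_blocks, limit)
print("filesystem has at most %d blocks" % limit)
print("%d of %d reported block numbers are beyond that" % (len(bad), len(reported_blocks)))
print("block_group 131071 in hex: %#x" % 131071)
```

Under that assumption the filesystem has at most 1540096 blocks, so every reported block number is far out of range, and block_group 131071 is 0x1ffff. To me that looks like garbage values rather than a slightly wrong bitmap, but I cannot tell where the garbage comes from.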
Can someone point me in the right direction, please?
Pascal